* [RFC PATCH 00/13] ARC: handle the lack of ZOL support
@ 2022-02-22 14:14 Sergey Matyukevich
  2022-02-22 14:14 ` [RFC PATCH 01/13] ARC: uaccess: elide unaligned handling if hardware supports Sergey Matyukevich
                   ` (13 more replies)
  0 siblings, 14 replies; 17+ messages in thread
From: Sergey Matyukevich @ 2022-02-22 14:14 UTC (permalink / raw)
  To: linux-snps-arc
  Cc: Vineet Gupta, Vladimir Isaev, Sergey Matyukevich, Sergey Matyukevich

From: Sergey Matyukevich <sergey.matyukevich@synopsys.com>

Hi Vineet and all,

This patch series continues preparing arch/arc for the upcoming ARCv3
support. ARCv3 does not support zero-overhead loops (ZOL), so this
series provides a set of changes that make ZOL support optional.

The series is based on top of Linux 5.17-rc5. It has been tested with
the CONFIG_ARC_LACKS_ZOL option enabled on ARCv2 HSDK hardware as well
as on the nSIM simulator for ARCv2.

I fixed typos, updated Vineet's email address, and slightly modified
several commit messages. Otherwise this patch series is the first chunk
of the ARCv3 bring-up changes by Vineet, available on the Synopsys GitHub:
github.com/foss-for-synopsys-dwc-arc-processors/linux

Regards,
Sergey

Vineet Gupta (13):
  ARC: uaccess: elide unaligned handling if hardware supports
  ARC: Kconfig: introduce option to disable ZOL
  ARC: uaccess: drop CC_OPTIMIZE_FOR_SIZE
  ARC: uaccess: elide ZOL, use double load/stores
  ARCv2: memset: don't prefetch for len == 0 which happens a lot
  ARCv2: memset: elide unaligned handling if hardware supports
  ARCv2: memset: rewrite using double load/stores
  ARC: string: use generic C code if no ZOL support
  ARC: delay: elide ZOL
  ARC: checksum: elide ZOL
  ARC: head: elide ZOL
  ARC: build: inhibit ZOL generation by compiler
  ARC: pt_regs: handle the case when ZOL is not supported

 arch/arc/Kconfig                           |  10 ++
 arch/arc/Makefile                          |   3 +
 arch/arc/include/asm/asm-macro-dbnz-emul.h |  12 ++
 arch/arc/include/asm/asm-macro-dbnz.h      |   8 ++
 arch/arc/include/asm/asm-macro-ll64-emul.h |  31 +++++
 arch/arc/include/asm/asm-macro-ll64.h      |  20 +++
 arch/arc/include/asm/assembler.h           |  41 ++++++
 arch/arc/include/asm/checksum.h            |  58 +++++++-
 arch/arc/include/asm/delay.h               |  16 +++
 arch/arc/include/asm/entry-arcv2.h         |   4 +
 arch/arc/include/asm/entry.h               |   2 +
 arch/arc/include/asm/ptrace.h              |   4 +-
 arch/arc/include/asm/string.h              |  15 ++-
 arch/arc/include/asm/uaccess.h             |  29 ++--
 arch/arc/kernel/arcksyms.c                 |   2 +
 arch/arc/kernel/asm-offsets.c              |   2 +
 arch/arc/kernel/disasm.c                   |   2 +
 arch/arc/kernel/head.S                     |   8 +-
 arch/arc/kernel/intc-arcv2.c               |   2 +
 arch/arc/kernel/kgdb.c                     |   4 +
 arch/arc/kernel/process.c                  |   2 +
 arch/arc/kernel/ptrace.c                   |  12 ++
 arch/arc/kernel/signal.c                   |   8 ++
 arch/arc/kernel/troubleshoot.c             |   3 +
 arch/arc/kernel/unaligned.c                |   2 +
 arch/arc/kernel/vmlinux.lds.S              |   2 +-
 arch/arc/lib/Makefile                      |   6 +
 arch/arc/lib/memset-archs.S                | 147 +++++++++------------
 arch/arc/lib/uaccess.S                     | 144 ++++++++++++++++++++
 arch/arc/mm/extable.c                      |  11 --
 30 files changed, 493 insertions(+), 117 deletions(-)
 create mode 100644 arch/arc/include/asm/asm-macro-dbnz-emul.h
 create mode 100644 arch/arc/include/asm/asm-macro-dbnz.h
 create mode 100644 arch/arc/include/asm/asm-macro-ll64-emul.h
 create mode 100644 arch/arc/include/asm/asm-macro-ll64.h
 create mode 100644 arch/arc/include/asm/assembler.h
 create mode 100644 arch/arc/lib/uaccess.S

-- 
2.25.1


_______________________________________________
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc


* [RFC PATCH 01/13] ARC: uaccess: elide unaligned handling if hardware supports
  2022-02-22 14:14 [RFC PATCH 00/13] ARC: handle the lack of ZOL support Sergey Matyukevich
@ 2022-02-22 14:14 ` Sergey Matyukevich
  2022-02-22 14:14 ` [RFC PATCH 02/13] ARC: Kconfig: introduce option to disable ZOL Sergey Matyukevich
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Sergey Matyukevich @ 2022-02-22 14:14 UTC (permalink / raw)
  To: linux-snps-arc
  Cc: Vineet Gupta, Vladimir Isaev, Sergey Matyukevich, Sergey Matyukevich

From: Vineet Gupta <vgupta@kernel.org>

Signed-off-by: Vineet Gupta <vgupta@kernel.org>
---
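For readers skimming the diff, the effect of the new condition can be
sketched in plain C. This is only a model, not the kernel code: the
hw_unaligned_ok parameter is a hypothetical stand-in for
IS_ENABLED(CONFIG_ARC_USE_UNALIGNED_MEM_ACCESS), and the helper names
are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if either address is not 4-byte aligned (same test as the diff). */
static bool misaligned(uintptr_t to, uintptr_t from)
{
	return (to & 0x3) || (from & 0x3);
}

/*
 * Illustrative helper: is the byte-wise fallback path needed?
 * In the kernel the flag is a compile-time constant (0 or 1), so when
 * unaligned access is enabled the whole expression folds to false and
 * the compiler can discard the fallback code.
 */
static bool needs_bytewise_fallback(uintptr_t to, uintptr_t from,
				    bool hw_unaligned_ok)
{
	return !hw_unaligned_ok && misaligned(to, from);
}
```

With hw_unaligned_ok set, a misaligned buffer simply takes the regular
word-copy path and relies on the hardware to handle it.
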
 arch/arc/include/asm/uaccess.h | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/arc/include/asm/uaccess.h b/arch/arc/include/asm/uaccess.h
index 783bfdb3bfa3..d78aae76831f 100644
--- a/arch/arc/include/asm/uaccess.h
+++ b/arch/arc/include/asm/uaccess.h
@@ -175,8 +175,9 @@ raw_copy_from_user(void *to, const void __user *from, unsigned long n)
 	if (n == 0)
 		return 0;
 
-	/* unaligned */
-	if (((unsigned long)to & 0x3) || ((unsigned long)from & 0x3)) {
+	/* fallback for unaligned access when hardware doesn't support */
+	if (!IS_ENABLED(CONFIG_ARC_USE_UNALIGNED_MEM_ACCESS) &&
+	     (((unsigned long)to & 0x3) || ((unsigned long)from & 0x3))) {
 
 		unsigned char tmp;
 
@@ -402,8 +403,9 @@ raw_copy_to_user(void __user *to, const void *from, unsigned long n)
 	if (n == 0)
 		return 0;
 
-	/* unaligned */
-	if (((unsigned long)to & 0x3) || ((unsigned long)from & 0x3)) {
+	/* fallback for unaligned access when hardware doesn't support */
+	if (!IS_ENABLED(CONFIG_ARC_USE_UNALIGNED_MEM_ACCESS) &&
+	     (((unsigned long)to & 0x3) || ((unsigned long)from & 0x3))) {
 
 		unsigned char tmp;
 
-- 
2.25.1



* [RFC PATCH 02/13] ARC: Kconfig: introduce option to disable ZOL
  2022-02-22 14:14 [RFC PATCH 00/13] ARC: handle the lack of ZOL support Sergey Matyukevich
  2022-02-22 14:14 ` [RFC PATCH 01/13] ARC: uaccess: elide unaligned handling if hardware supports Sergey Matyukevich
@ 2022-02-22 14:14 ` Sergey Matyukevich
  2022-02-22 14:14 ` [RFC PATCH 03/13] ARC: uaccess: drop CC_OPTIMIZE_FOR_SIZE Sergey Matyukevich
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Sergey Matyukevich @ 2022-02-22 14:14 UTC (permalink / raw)
  To: linux-snps-arc
  Cc: Vineet Gupta, Vladimir Isaev, Sergey Matyukevich, Sergey Matyukevich

From: Vineet Gupta <vgupta@kernel.org>

Upcoming ARCv3 lacks ZOL support, so provide alternatives
based on the DBNZ instruction introduced in ARCv2.

Signed-off-by: Vineet Gupta <vgupta@kernel.org>
---
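As a rough illustration of what the DBNZR macro is expected to do, here
is a small C model. It assumes DBNZ means "decrement the register, then
branch if the result is non-zero"; the function names are hypothetical
and only mirror the two macro variants added below.

```c
#include <assert.h>

/* Model of native DBNZ: decrement, report whether to branch back. */
static int dbnz(unsigned int *r)
{
	*r -= 1;
	return *r != 0;
}

/* Model of the emulation: sub_s r, r, 1 followed by brne_s r, 0, lbl. */
static int dbnz_emul(unsigned int *r)
{
	*r = *r - 1;		/* sub_s  r, r, 1 */
	return *r != 0;		/* brne_s r, 0, lbl */
}

/* Run a loop in the style of "1: <body>; DBNZR r, 1b". */
static unsigned int run_loop(unsigned int n, int (*step)(unsigned int *))
{
	unsigned int iterations = 0;

	assert(n != 0);		/* a zero count must be skipped by the caller */
	do {
		iterations++;	/* loop body */
	} while (step(&n));

	return iterations;
}
```

Because the body runs before the counter is tested, a loop built this
way executes at least once, so a zero count has to be filtered out
before entering it.
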
 arch/arc/Kconfig                           | 10 ++++++++
 arch/arc/Makefile                          |  1 +
 arch/arc/include/asm/asm-macro-dbnz-emul.h | 12 +++++++++
 arch/arc/include/asm/asm-macro-dbnz.h      |  8 ++++++
 arch/arc/include/asm/assembler.h           | 29 ++++++++++++++++++++++
 5 files changed, 60 insertions(+)
 create mode 100644 arch/arc/include/asm/asm-macro-dbnz-emul.h
 create mode 100644 arch/arc/include/asm/asm-macro-dbnz.h
 create mode 100644 arch/arc/include/asm/assembler.h

diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index 3c2a4753d09b..9daef7c763ce 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -341,6 +341,16 @@ config ARC_HAS_SWAPE
 
 if ISA_ARCV2
 
+config ARC_LACKS_ZOL
+	bool "Disable Zero Delay hardware loops"
+	help
+	  ARC CPUs have historically had the ZOL hardware loop mechanism, which
+	  the ARCv3 ISA drops. Architecturally ZOL provides
+	    - LPcc instruction
+	    - LP_COUNT core reg
+	    - LP_START, LP_END aux regs
+	  This option removes any use of ZOL instructions/regs from code
+
 config ARC_USE_UNALIGNED_MEM_ACCESS
 	bool "Enable unaligned access in HW"
 	default y
diff --git a/arch/arc/Makefile b/arch/arc/Makefile
index efc54f3e35e0..ec0f672bcee6 100644
--- a/arch/arc/Makefile
+++ b/arch/arc/Makefile
@@ -39,6 +39,7 @@ LINUXINCLUDE	+=  -include $(srctree)/arch/arc/include/asm/current.h
 endif
 
 cflags-y				+= -fsection-anchors
+cflags-y				+= -Wa,-I$(srctree)/arch/arc/include
 
 cflags-$(CONFIG_ARC_HAS_LLSC)		+= -mlock
 cflags-$(CONFIG_ARC_HAS_SWAPE)		+= -mswape
diff --git a/arch/arc/include/asm/asm-macro-dbnz-emul.h b/arch/arc/include/asm/asm-macro-dbnz-emul.h
new file mode 100644
index 000000000000..8c89f4234408
--- /dev/null
+++ b/arch/arc/include/asm/asm-macro-dbnz-emul.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+/*
+ * DBNZ emulation for ARCompact or earlier ARCv2 cores
+ * 2 byte short instructions used to keep code size same as 4 byte DBNZ.
+ * This warrants usage of r0-r3, r12-r15, gas barfs otherwise catching
+ * offenders immediately
+ */
+.macro DBNZR r, lbl
+	sub_s  \r, \r, 1
+	brne_s \r, 0, \lbl
+.endm
diff --git a/arch/arc/include/asm/asm-macro-dbnz.h b/arch/arc/include/asm/asm-macro-dbnz.h
new file mode 100644
index 000000000000..fe658d2eab51
--- /dev/null
+++ b/arch/arc/include/asm/asm-macro-dbnz.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+/*
+ * DBNZ instruction introduced in ARCv2
+ */
+.macro DBNZR r, lbl
+	dbnz  \r, \lbl
+.endm
diff --git a/arch/arc/include/asm/assembler.h b/arch/arc/include/asm/assembler.h
new file mode 100644
index 000000000000..426488ef27d4
--- /dev/null
+++ b/arch/arc/include/asm/assembler.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef __ASM_ARC_ASM_H
+#define __ASM_ARC_ASM_H 1
+
+#ifdef __ASSEMBLY__
+
+#ifdef CONFIG_ARC_LACKS_ZOL
+#include <asm/asm-macro-dbnz.h>
+#else
+#include <asm/asm-macro-dbnz-emul.h>
+#endif
+
+#else	/* !__ASSEMBLY__ */
+
+/*
+ * ARCv2 cores have both LPcc and DBNZ instructions (starting 3.5a release).
+ * But in this context, LP present implies DBNZ not available (ARCompact ISA)
+ * or just not desirable, so emulate DBNZ with base instructions.
+ */
+#ifdef CONFIG_ARC_LACKS_ZOL
+asm(".include \"asm/asm-macro-dbnz.h\"\n");
+#else
+asm(".include \"asm/asm-macro-dbnz-emul.h\"\n");
+#endif
+
+#endif	/* __ASSEMBLY__ */
+
+#endif
-- 
2.25.1



* [RFC PATCH 03/13] ARC: uaccess: drop CC_OPTIMIZE_FOR_SIZE
  2022-02-22 14:14 [RFC PATCH 00/13] ARC: handle the lack of ZOL support Sergey Matyukevich
  2022-02-22 14:14 ` [RFC PATCH 01/13] ARC: uaccess: elide unaligned handling if hardware supports Sergey Matyukevich
  2022-02-22 14:14 ` [RFC PATCH 02/13] ARC: Kconfig: introduce option to disable ZOL Sergey Matyukevich
@ 2022-02-22 14:14 ` Sergey Matyukevich
  2022-02-22 14:14 ` [RFC PATCH 04/13] ARC: uaccess: elide ZOL, use double load/stores Sergey Matyukevich
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Sergey Matyukevich @ 2022-02-22 14:14 UTC (permalink / raw)
  To: linux-snps-arc
  Cc: Vineet Gupta, Vladimir Isaev, Sergey Matyukevich, Sergey Matyukevich

From: Vineet Gupta <vgupta@kernel.org>

Currently ARC uses CC_OPTIMIZE_FOR_PERFORMANCE_O3, which excludes
CC_OPTIMIZE_FOR_SIZE. So drop the unused #ifdef branch.

Signed-off-by: Vineet Gupta <vgupta@kernel.org>
---
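The "#define __clear_user __clear_user" line in the diff below is the
usual way for an arch header to tell a generic header that it already
provides a function. A minimal sketch of the idiom, assuming the
generic side guards its fallback with #ifndef; my_clear is a
hypothetical name used only for illustration:

```c
#include <stddef.h>
#include <string.h>

/* "Arch" side: provide the function, then define a same-named macro. */
static inline unsigned long my_clear(void *to, unsigned long n)
{
	memset(to, 0, n);
	return 0;		/* number of bytes NOT cleared */
}
#define my_clear my_clear

/* "Generic" side: supply a fallback only if the arch did not. */
#ifndef my_clear
static inline unsigned long my_clear(void *to, unsigned long n)
{
	unsigned char *p = to;

	while (n--)
		*p++ = 0;
	return 0;
}
#endif
```

The macro expands to its own name, so callers are unaffected; it exists
only so that the #ifndef test can see that a definition is present.
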
 arch/arc/include/asm/uaccess.h | 11 ++---------
 arch/arc/mm/extable.c          | 11 -----------
 2 files changed, 2 insertions(+), 20 deletions(-)

diff --git a/arch/arc/include/asm/uaccess.h b/arch/arc/include/asm/uaccess.h
index d78aae76831f..9b009e64e79c 100644
--- a/arch/arc/include/asm/uaccess.h
+++ b/arch/arc/include/asm/uaccess.h
@@ -615,7 +615,7 @@ raw_copy_to_user(void __user *to, const void *from, unsigned long n)
 	return res;
 }
 
-static inline unsigned long __arc_clear_user(void __user *to, unsigned long n)
+static inline unsigned long __clear_user(void __user *to, unsigned long n)
 {
 	long res = n;
 	unsigned char *d_char = to;
@@ -657,17 +657,10 @@ static inline unsigned long __arc_clear_user(void __user *to, unsigned long n)
 	return res;
 }
 
-#ifndef CONFIG_CC_OPTIMIZE_FOR_SIZE
-
 #define INLINE_COPY_TO_USER
 #define INLINE_COPY_FROM_USER
 
-#define __clear_user(d, n)		__arc_clear_user(d, n)
-#else
-extern unsigned long arc_clear_user_noinline(void __user *to,
-		unsigned long n);
-#define __clear_user(d, n)		arc_clear_user_noinline(d, n)
-#endif
+#define __clear_user		__clear_user
 
 #include <asm/segment.h>
 #include <asm-generic/uaccess.h>
diff --git a/arch/arc/mm/extable.c b/arch/arc/mm/extable.c
index 4e14c4244ea2..88fa3a4d4906 100644
--- a/arch/arc/mm/extable.c
+++ b/arch/arc/mm/extable.c
@@ -22,14 +22,3 @@ int fixup_exception(struct pt_regs *regs)
 
 	return 0;
 }
-
-#ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
-
-unsigned long arc_clear_user_noinline(void __user *to,
-		unsigned long n)
-{
-	return __arc_clear_user(to, n);
-}
-EXPORT_SYMBOL(arc_clear_user_noinline);
-
-#endif
-- 
2.25.1



* [RFC PATCH 04/13] ARC: uaccess: elide ZOL, use double load/stores
  2022-02-22 14:14 [RFC PATCH 00/13] ARC: handle the lack of ZOL support Sergey Matyukevich
                   ` (2 preceding siblings ...)
  2022-02-22 14:14 ` [RFC PATCH 03/13] ARC: uaccess: drop CC_OPTIMIZE_FOR_SIZE Sergey Matyukevich
@ 2022-02-22 14:14 ` Sergey Matyukevich
  2022-02-22 14:14 ` [RFC PATCH 05/13] ARCv2: memset: don't prefetch for len == 0 which happens a lot Sergey Matyukevich
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Sergey Matyukevich @ 2022-02-22 14:14 UTC (permalink / raw)
  To: linux-snps-arc
  Cc: Vineet Gupta, Vladimir Isaev, Sergey Matyukevich, Sergey Matyukevich

From: Vineet Gupta <vgupta@kernel.org>

Upcoming ARCv3 lacks ZOL support, so provide alternative
uaccess implementations based on 64-bit memory operations.

Signed-off-by: Vineet Gupta <vgupta@kernel.org>
---
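The control flow of the new assembly routines can be modelled in C as
follows. This is a sketch under stated assumptions, not the actual
implementation: copy_model and its fault_at parameter (the number of
source bytes readable before a simulated fault) are hypothetical, and a
fault is modelled as a jump to the "done" label, the way the __ex_table
entries redirect execution.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of raw_copy_from_user in uaccess.S.
 * Pass SIZE_MAX as fault_at for "never faults".
 * Returns the number of bytes not copied, 0 on success.
 */
static size_t copy_model(unsigned char *dst, const unsigned char *src,
			 size_t n, size_t fault_at)
{
	unsigned char *end = dst + n;	/* add r8, r0, r2 */
	size_t chunks = n >> 4;		/* lsr.f r3, r2, 4 */
	size_t tail = n & 0xf;		/* and.f r3, r2, 0xf */
	size_t done = 0;

	while (chunks--) {		/* chunks of 16 bytes */
		if (done + 16 > fault_at)
			goto out;	/* a load faulted: nothing stored */
		memcpy(dst, src, 16);
		dst += 16;
		src += 16;
		done += 16;
	}
	while (tail--) {		/* last 1-15 bytes */
		if (done + 1 > fault_at)
			goto out;
		*dst++ = *src++;
		done++;
	}
out:
	return (size_t)(end - dst);	/* sub r0, r8, r0 */
}
```

Since the end pointer is computed up front, the fault path needs no
extra bookkeeping: wherever the copy stops, the distance to the end
pointer is the residue to report.
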
 arch/arc/include/asm/asm-macro-ll64-emul.h |  28 ++++
 arch/arc/include/asm/asm-macro-ll64.h      |  20 +++
 arch/arc/include/asm/assembler.h           |  12 ++
 arch/arc/include/asm/uaccess.h             |  12 ++
 arch/arc/lib/Makefile                      |   2 +
 arch/arc/lib/uaccess.S                     | 144 +++++++++++++++++++++
 6 files changed, 218 insertions(+)
 create mode 100644 arch/arc/include/asm/asm-macro-ll64-emul.h
 create mode 100644 arch/arc/include/asm/asm-macro-ll64.h
 create mode 100644 arch/arc/lib/uaccess.S

diff --git a/arch/arc/include/asm/asm-macro-ll64-emul.h b/arch/arc/include/asm/asm-macro-ll64-emul.h
new file mode 100644
index 000000000000..886320cc74ad
--- /dev/null
+++ b/arch/arc/include/asm/asm-macro-ll64-emul.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+/*
+ * Abstraction for 64-bit load/store:
+ *   - Emulate 64-bit access with two 32-bit load/stores.
+ *   - In the non-emulated case, output register pair r<N>:r<N+1>
+ *     so macro takes only 1 output arg and determines the 2nd.
+ */
+
+.macro ST64.ab d, s, incr
+	st.ab	\d, [\s, \incr / 2]
+	.ifeqs	"\d", "r4"
+		st.ab	r5, [\s, \incr / 2]
+	.endif
+	.ifeqs	"\d", "r6"
+		st.ab	r7, [\s, \incr / 2]
+	.endif
+.endm
+
+.macro LD64.ab d, s, incr
+	ld.ab	\d, [\s, \incr / 2]
+	.ifeqs	"\d", "r4"
+		ld.ab	r5, [\s, \incr / 2]
+	.endif
+	.ifeqs	"\d", "r6"
+		ld.ab	r7, [\s, \incr / 2]
+	.endif
+.endm
diff --git a/arch/arc/include/asm/asm-macro-ll64.h b/arch/arc/include/asm/asm-macro-ll64.h
new file mode 100644
index 000000000000..89e05c923a26
--- /dev/null
+++ b/arch/arc/include/asm/asm-macro-ll64.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+/*
+ * Abstraction for 64-bit load/store:
+ *   - Single instruction to double load/store
+ *   - output register pair r<N>:r<N+1> but only
+ *     first register needs to be specified
+ */
+
+.irp xx,,.ab
+.macro ST64\xx d, s, off=0
+	std\xx	\d, [\s, \off]
+.endm
+.endr
+
+.irp xx,,.ab
+.macro LD64\xx d, s, off=0
+	ldd\xx	\d, [\s, \off]
+.endm
+.endr
diff --git a/arch/arc/include/asm/assembler.h b/arch/arc/include/asm/assembler.h
index 426488ef27d4..1d69390c22ba 100644
--- a/arch/arc/include/asm/assembler.h
+++ b/arch/arc/include/asm/assembler.h
@@ -5,6 +5,12 @@
 
 #ifdef __ASSEMBLY__
 
+#ifdef CONFIG_ARC_HAS_LL64
+#include <asm/asm-macro-ll64.h>
+#else
+#include <asm/asm-macro-ll64-emul.h>
+#endif
+
 #ifdef CONFIG_ARC_LACKS_ZOL
 #include <asm/asm-macro-dbnz.h>
 #else
@@ -13,6 +19,12 @@
 
 #else	/* !__ASSEMBLY__ */
 
+#ifdef CONFIG_ARC_HAS_LL64
+asm(".include \"asm/asm-macro-ll64.h\"\n");
+#else
+asm(".include \"asm/asm-macro-ll64-emul.h\"\n");
+#endif
+
 /*
  * ARCv2 cores have both LPcc and DBNZ instructions (starting 3.5a release).
  * But in this context, LP present implies DBNZ not available (ARCompact ISA)
diff --git a/arch/arc/include/asm/uaccess.h b/arch/arc/include/asm/uaccess.h
index 9b009e64e79c..f5b97d977c1b 100644
--- a/arch/arc/include/asm/uaccess.h
+++ b/arch/arc/include/asm/uaccess.h
@@ -163,6 +163,7 @@
 	: "+r" (ret)				\
 	: "r" (src), "r" (dst), "ir" (-EFAULT))
 
+#ifndef CONFIG_ARC_LACKS_ZOL
 
 static inline unsigned long
 raw_copy_from_user(void *to, const void __user *from, unsigned long n)
@@ -660,6 +661,17 @@ static inline unsigned long __clear_user(void __user *to, unsigned long n)
 #define INLINE_COPY_TO_USER
 #define INLINE_COPY_FROM_USER
 
+#else
+
+extern unsigned long raw_copy_from_user(void *to, const void __user *from,
+					  unsigned long n);
+extern unsigned long raw_copy_to_user(void __user *to, const void *from,
+					unsigned long n);
+
+extern unsigned long __clear_user(void __user *to, unsigned long n);
+
+#endif
+
 #define __clear_user		__clear_user
 
 #include <asm/segment.h>
diff --git a/arch/arc/lib/Makefile b/arch/arc/lib/Makefile
index 30158ae69fd4..87d18f5013dc 100644
--- a/arch/arc/lib/Makefile
+++ b/arch/arc/lib/Makefile
@@ -13,3 +13,5 @@ lib-$(CONFIG_ISA_ARCV2)		+=memcpy-archs-unaligned.o
 else
 lib-$(CONFIG_ISA_ARCV2)		+=memcpy-archs.o
 endif
+
+lib-$(CONFIG_ARC_LACKS_ZOL)	+= uaccess.o
diff --git a/arch/arc/lib/uaccess.S b/arch/arc/lib/uaccess.S
new file mode 100644
index 000000000000..5093160a72d3
--- /dev/null
+++ b/arch/arc/lib/uaccess.S
@@ -0,0 +1,144 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * uaccess for ARCv3: avoids ZOL, uses 64-bit memory ops
+ *   ASSUMES unaligned access
+ */
+
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+
+#ifndef CONFIG_ARC_USE_UNALIGNED_MEM_ACCESS
+#error "Unaligned access support needed"
+#endif
+
+; Input
+;  - r0: dest, kernel
+;  - r1: src, user
+;  - r2: sz
+; Output
+;  - r0: Num bytes left to copy, 0 on success
+
+ENTRY_CFI(raw_copy_from_user)
+
+	add    r8, r0, r2
+
+	lsr.f  r3, r2, 4
+	bz     .L1dobytes
+
+	; chunks of 16 bytes
+10:	LD64.ab r4, r1, 8
+11:	LD64.ab r6, r1, 8
+	ST64.ab r4, r0, 8
+	ST64.ab r6, r0, 8
+	DBNZR  r3, 10b
+
+.L1dobytes:
+	; last 1-15 bytes
+	and.f  r3, r2, 0xf
+	bz     .L1done
+
+12:	ldb.ab r4, [r1, 1]
+	stb.ab r4, [r0, 1]
+	DBNZR  r3, 12b
+
+.L1done:
+	; bytes not copied = orig_dst + sz - curr_dst
+	j.d    [blink]
+	sub    r0, r8, r0
+END_CFI(raw_copy_from_user)
+
+.section __ex_table, "a"
+	.word 10b, .L1done
+	.word 11b, .L1done
+	.word 12b, .L1done
+.previous
+
+; Input
+;  - r0: dest, user
+;  - r1: src, kernel
+;  - r2: sz
+; Output
+;  - r0: Num bytes left to copy, 0 on success
+
+ENTRY_CFI(raw_copy_to_user)
+
+	add    r8, r1, r2
+
+	lsr.f  r3, r2, 4
+	bz     .L2dobytes
+
+	; chunks of 16 bytes
+2:	LD64.ab r4, r1, 8
+	LD64.ab r6, r1, 8
+20:	ST64.ab r4, r0, 8
+21:	ST64.ab r6, r0, 8
+	DBNZR  r3, 2b
+
+.L2dobytes:
+	; last 1-15 bytes
+	and.f  r3, r2, 0xf
+	bz     .L2done
+
+2:	ldb.ab r4, [r1, 1]
+22:	stb.ab r4, [r0, 1]
+	DBNZR  r3, 2b
+
+.L2done:
+	; bytes not copied = orig_src + sz - curr_src
+	j.d    [blink]
+	sub    r0, r8, r1
+
+END_CFI(raw_copy_to_user)
+
+.section __ex_table, "a"
+	.word 20b, .L2done
+	.word 21b, .L2done
+	.word 22b, .L2done
+.previous
+
+ENTRY_CFI(__clear_user)
+	add    r8, r0, r1
+
+	mov    r4, 0
+	mov    r5, 0
+
+	lsr.f  r3, r1, 4
+	bz     .L3dobytes
+
+	; chunks of 16 bytes
+30:	ST64.ab r4, r0, 8
+31:	ST64.ab r4, r0, 8
+	DBNZR  r3, 30b
+
+.L3dobytes:
+	; last 1-15 bytes
+	and.f  r3, r1, 0xf
+	bz     .L3done
+
+32:	stb.ab r4, [r0, 1]
+	DBNZR  r3, 32b
+
+.L3done:
+	; bytes not cleared = orig_dst + sz - curr_dst
+	j.d    [blink]
+	sub    r0, r8, r0
+
+END_CFI(__clear_user)
+
+; Note that the .fixup section is missing, and that is not an omission.
+;
+; .fixup is a level of indirection that lets user fault handling do some extra
+; work before jumping to a safe instruction (past the faulting LD/ST) in uaccess
+; code, for example setting up -EFAULT in the return register for the caller.
+; If that is not needed (as above, where the number of bytes not copied ends up
+; in return reg r0 anyway) and the fault handler only needs to resume at a
+; valid PC, that label can be placed directly in the __ex_table entry.
+; do_page_fault() -> fixup_exception() uses it to set up pt_regs->ret, which
+; the CPU exception handler resumes to. This also makes the handling more
+; efficient by removing a level of indirection.
+
+.section __ex_table, "a"
+	.word 30b, .L3done
+	.word 31b, .L3done
+	.word 32b, .L3done
+.previous
-- 
2.25.1



* [RFC PATCH 05/13] ARCv2: memset: don't prefetch for len == 0 which happens a lot
  2022-02-22 14:14 [RFC PATCH 00/13] ARC: handle the lack of ZOL support Sergey Matyukevich
                   ` (3 preceding siblings ...)
  2022-02-22 14:14 ` [RFC PATCH 04/13] ARC: uaccess: elide ZOL, use double load/stores Sergey Matyukevich
@ 2022-02-22 14:14 ` Sergey Matyukevich
  2022-02-22 14:14 ` [RFC PATCH 06/13] ARCv2: memset: elide unaligned handling if hardware supports Sergey Matyukevich
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Sergey Matyukevich @ 2022-02-22 14:14 UTC (permalink / raw)
  To: linux-snps-arc
  Cc: Vineet Gupta, Vladimir Isaev, Sergey Matyukevich, Sergey Matyukevich

From: Vineet Gupta <vgupta@kernel.org>

Issuing the prefetch-for-write before the size check dirties the cache
line (and possibly fetches it from other cores) even when size == 0.
Moving the prefetch after the check avoids this potential "bleeding"
and, for the same reason, is also more efficient.

Signed-off-by: Vineet Gupta <vgupta@kernel.org>
---
 arch/arc/lib/memset-archs.S | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arc/lib/memset-archs.S b/arch/arc/lib/memset-archs.S
index d2e09fece5bc..d0a5cec4cdca 100644
--- a/arch/arc/lib/memset-archs.S
+++ b/arch/arc/lib/memset-archs.S
@@ -36,12 +36,13 @@
 #endif
 
 ENTRY_CFI(memset)
-	PREFETCHW_INSTR	r0, 0	; Prefetch the first write location
 	mov.f	0, r2
 ;;; if size is zero
 	jz.d	[blink]
 	mov	r3, r0		; don't clobber ret val
 
+	PREFETCHW_INSTR	r0, 0	; Prefetch the first write location
+
 ;;; if length < 8
 	brls.d.nt	r2, 8, .Lsmallchunk
 	mov.f	lp_count,r2
-- 
2.25.1



* [RFC PATCH 06/13] ARCv2: memset: elide unaligned handling if hardware supports
  2022-02-22 14:14 [RFC PATCH 00/13] ARC: handle the lack of ZOL support Sergey Matyukevich
                   ` (4 preceding siblings ...)
  2022-02-22 14:14 ` [RFC PATCH 05/13] ARCv2: memset: don't prefetch for len == 0 which happens a lot Sergey Matyukevich
@ 2022-02-22 14:14 ` Sergey Matyukevich
  2022-02-22 14:15 ` [RFC PATCH 07/13] ARCv2: memset: rewrite using double load/stores Sergey Matyukevich
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Sergey Matyukevich @ 2022-02-22 14:14 UTC (permalink / raw)
  To: linux-snps-arc
  Cc: Vineet Gupta, Vladimir Isaev, Sergey Matyukevich, Sergey Matyukevich

From: Vineet Gupta <vgupta@kernel.org>

The only functional change is eliding the unaligned buffer head
handling. Also clean up the macros, adding default argument values.

Signed-off-by: Vineet Gupta <vgupta@kernel.org>
---
 arch/arc/lib/memset-archs.S | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/arch/arc/lib/memset-archs.S b/arch/arc/lib/memset-archs.S
index d0a5cec4cdca..330e22f7cf3c 100644
--- a/arch/arc/lib/memset-archs.S
+++ b/arch/arc/lib/memset-archs.S
@@ -17,43 +17,43 @@
 
 #if L1_CACHE_SHIFT == 6
 
-.macro PREALLOC_INSTR	reg, off
-	prealloc	[\reg, \off]
+.macro PREALLOCR s, off=0
+	prealloc [\s, \off]
 .endm
 
-.macro PREFETCHW_INSTR	reg, off
-	prefetchw	[\reg, \off]
+.macro PREFETCHWR s, off=0
+	prefetchw [\s, \off]
 .endm
 
 #else
 
-.macro PREALLOC_INSTR	reg, off
+.macro PREALLOCR s, off=0
 .endm
 
-.macro PREFETCHW_INSTR	reg, off
+.macro PREFETCHWR s, off=0
 .endm
 
 #endif
 
 ENTRY_CFI(memset)
+	; return if size 0 (happens a lot)
 	mov.f	0, r2
-;;; if size is zero
 	jz.d	[blink]
-	mov	r3, r0		; don't clobber ret val
+	mov	r3, r0	; make a copy of input pointer reg
 
-	PREFETCHW_INSTR	r0, 0	; Prefetch the first write location
+	PREFETCHWR r0
 
-;;; if length < 8
-	brls.d.nt	r2, 8, .Lsmallchunk
-	mov.f	lp_count,r2
+	; small 1-7 byte sizes handled in tail byte loop :-)
+	brlo	r2, 8, .Lbyteloop
 
-	and.f	r4, r0, 0x03
-	rsub	lp_count, r4, 4
-	lpnz	@.Laligndestination
-	;; LOOP BEGIN
+#ifndef CONFIG_ARC_USE_UNALIGNED_MEM_ACCESS
+	; handle any starting unaligned bytes (up to 3)
+	and.f	lp_count, r0, 0x3
+	lpnz	1f
 	stb.ab	r1, [r3,1]
 	sub	r2, r2, 1
-.Laligndestination:
+1:
+#endif
 
 ;;; Destination is aligned
 	and	r1, r1, 0xFF
-- 
2.25.1



* [RFC PATCH 07/13] ARCv2: memset: rewrite using double load/stores
  2022-02-22 14:14 [RFC PATCH 00/13] ARC: handle the lack of ZOL support Sergey Matyukevich
                   ` (5 preceding siblings ...)
  2022-02-22 14:14 ` [RFC PATCH 06/13] ARCv2: memset: elide unaligned handling if hardware supports Sergey Matyukevich
@ 2022-02-22 14:15 ` Sergey Matyukevich
  2022-02-22 14:15 ` [RFC PATCH 08/13] ARC: string: use generic C code if no ZOL support Sergey Matyukevich
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Sergey Matyukevich @ 2022-02-22 14:15 UTC (permalink / raw)
  To: linux-snps-arc
  Cc: Vineet Gupta, Vladimir Isaev, Sergey Matyukevich, Sergey Matyukevich

From: Vineet Gupta <vgupta@kernel.org>

Signed-off-by: Vineet Gupta <vgupta@kernel.org>
---
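The way the rewritten memset splits its work across the three loops can
be checked with a small C model. This is a sketch only: it models the
length arithmetic after any head alignment, the struct and function
names are hypothetical, and promote_pattern models the byte replication
into one 32-bit word.

```c
#include <stddef.h>
#include <stdint.h>

struct memset_plan {
	size_t line_iters;	/* Loop #a: 64 bytes per iteration */
	size_t half_iters;	/* Loop #b: 32 bytes per iteration */
	size_t tail_bytes;	/* Loop #c: 1 byte per iteration */
};

/* Replicate the low byte of c into all four bytes of a 32-bit word. */
static uint32_t promote_pattern(int c)
{
	uint32_t b = (uint32_t)c & 0xFF;	/* and r1, r1, 0xFF */
	uint32_t h = (b << 8) | b;		/* asl r4, r1, 8; or r4, r4, r1 */

	return (h << 16) | h;
}

static struct memset_plan plan_memset(size_t n)
{
	struct memset_plan p;
	size_t bulk = 0;		/* bytes eligible for Loop #a */
	size_t rest = n;		/* bytes left for Loops #b and #c */

	if (n > 64) {
		bulk = n - 64;		/* sub r6, r2, 64 */
		rest = (n & 63) + 64;	/* bmsk.hi, then add.hi ..., 64 */
	}
	p.line_iters = bulk >> 6;	/* lsr.f lp_count, r6, 6 */
	p.half_iters = rest >> 5;	/* lsr.f lp_count, r2, 5 */
	p.tail_bytes = rest & 0x1F;	/* and.f lp_count, r2, 0x1F */
	return p;
}

/* Total bytes written by the three loops together. */
static size_t plan_total(struct memset_plan p)
{
	return p.line_iters * 64 + p.half_iters * 32 + p.tail_bytes;
}
```

In this model Loop #a only runs from 128 bytes upwards and always
leaves the final 64 bytes to the later loops, which matches the intent
stated in the comments of the diff: the PREALLOC of the next line never
reaches past the end of the buffer.
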
 arch/arc/lib/memset-archs.S | 112 ++++++++++++++----------------------
 1 file changed, 43 insertions(+), 69 deletions(-)

diff --git a/arch/arc/lib/memset-archs.S b/arch/arc/lib/memset-archs.S
index 330e22f7cf3c..a9a0ccef761d 100644
--- a/arch/arc/lib/memset-archs.S
+++ b/arch/arc/lib/memset-archs.S
@@ -5,6 +5,7 @@
 
 #include <linux/linkage.h>
 #include <asm/cache.h>
+#include <asm/assembler.h>
 
 /*
  * The memset implementation below is optimized to use prefetchw and prealloc
@@ -55,7 +56,7 @@ ENTRY_CFI(memset)
 1:
 #endif
 
-;;; Destination is aligned
+	; promote memset pattern from char to int (double actually for STD)
 	and	r1, r1, 0xFF
 	asl	r4, r1, 8
 	or	r4, r4, r1
@@ -63,75 +64,48 @@ ENTRY_CFI(memset)
 	or	r5, r5, r4
 	mov	r4, r5
 
-	sub3	lp_count, r2, 8
-	cmp     r2, 64
-	bmsk.hi	r2, r2, 5
-	mov.ls	lp_count, 0
-	add3.hi	r2, r2, 8
-
-;;; Convert len to Dwords, unfold x8
-	lsr.f	lp_count, lp_count, 6
-
-	lpnz	@.Lset64bytes
-	;; LOOP START
-	PREALLOC_INSTR	r3, 64	; alloc next line w/o fetching
-
-#ifdef CONFIG_ARC_HAS_LL64
-	std.ab	r4, [r3, 8]
-	std.ab	r4, [r3, 8]
-	std.ab	r4, [r3, 8]
-	std.ab	r4, [r3, 8]
-	std.ab	r4, [r3, 8]
-	std.ab	r4, [r3, 8]
-	std.ab	r4, [r3, 8]
-	std.ab	r4, [r3, 8]
-#else
-	st.ab	r4, [r3, 4]
-	st.ab	r4, [r3, 4]
-	st.ab	r4, [r3, 4]
-	st.ab	r4, [r3, 4]
-	st.ab	r4, [r3, 4]
-	st.ab	r4, [r3, 4]
-	st.ab	r4, [r3, 4]
-	st.ab	r4, [r3, 4]
-	st.ab	r4, [r3, 4]
-	st.ab	r4, [r3, 4]
-	st.ab	r4, [r3, 4]
-	st.ab	r4, [r3, 4]
-	st.ab	r4, [r3, 4]
-	st.ab	r4, [r3, 4]
-	st.ab	r4, [r3, 4]
-	st.ab	r4, [r3, 4]
-#endif
-.Lset64bytes:
-
-	lsr.f	lp_count, r2, 5 ;Last remaining  max 124 bytes
-	lpnz	.Lset32bytes
-	;; LOOP START
-#ifdef CONFIG_ARC_HAS_LL64
-	std.ab	r4, [r3, 8]
-	std.ab	r4, [r3, 8]
-	std.ab	r4, [r3, 8]
-	std.ab	r4, [r3, 8]
-#else
-	st.ab	r4, [r3, 4]
-	st.ab	r4, [r3, 4]
-	st.ab	r4, [r3, 4]
-	st.ab	r4, [r3, 4]
-	st.ab	r4, [r3, 4]
-	st.ab	r4, [r3, 4]
-	st.ab	r4, [r3, 4]
-	st.ab	r4, [r3, 4]
-#endif
-.Lset32bytes:
-
-	and.f	lp_count, r2, 0x1F ;Last remaining 31 bytes
-.Lsmallchunk:
-	lpnz	.Lcopy3bytes
-	;; LOOP START
+	; Loop #a:
+	; - Updates 1 cache line worth data (64 bytes) per iteration
+	; - PREALLOC the next line.
+	;
+	; = Only entered if at least 2 lines worth of work (i.e. >= 128 bytes),
+	;   else PREALLOC for next can "bleed" past end of buffer, causing data
+	;   corruption issue if that line is owned by some other core.
+	; = Last 64 bytes (even for min 128 bytes work) are NOT done here to
+	;   avoid PREALLOC issue
+
+	sub     r6, r2, 64
+	cmp	r2, 64
+	bmsk.hi	r2, r2, 5	; trailing 63 bytes
+	mov.ls	r6, 0
+	add.hi	r2, r2, 64	; line skipped in loop below
+
+	lsr.f	lp_count, r6, 6
+	lpnz	2f
+	PREALLOCR r3, 64
+	ST64.ab	r4, r3, 8
+	ST64.ab	r4, r3, 8
+	ST64.ab	r4, r3, 8
+	ST64.ab	r4, r3, 8
+	ST64.ab	r4, r3, 8
+	ST64.ab	r4, r3, 8
+	ST64.ab	r4, r3, 8
+	ST64.ab	r4, r3, 8
+2:
+	; Loop #b: Remaining 32 / 64 bytes
+	lsr.f	lp_count, r2, 5
+	lpnz	.Lbyteloop
+	ST64.ab	r4, r3, 8
+	ST64.ab	r4, r3, 8
+	ST64.ab	r4, r3, 8
+	ST64.ab	r4, r3, 8
+
+.Lbyteloop:
+	; Loop #c: straggler 31 bytes
+	and.f	lp_count, r2, 0x1F
+	lpnz	4f
 	stb.ab	r1, [r3, 1]
-.Lcopy3bytes:
-
+4:
 	j	[blink]
 
 END_CFI(memset)
-- 
2.25.1



* [RFC PATCH 08/13] ARC: string: use generic C code if no ZOL support
  2022-02-22 14:14 [RFC PATCH 00/13] ARC: handle the lack of ZOL support Sergey Matyukevich
                   ` (6 preceding siblings ...)
  2022-02-22 14:15 ` [RFC PATCH 07/13] ARCv2: memset: rewrite using double load/stores Sergey Matyukevich
@ 2022-02-22 14:15 ` Sergey Matyukevich
  2022-02-22 14:15 ` [RFC PATCH 09/13] ARC: delay: elide ZOL Sergey Matyukevich
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Sergey Matyukevich @ 2022-02-22 14:15 UTC (permalink / raw)
  To: linux-snps-arc
  Cc: Vineet Gupta, Vladimir Isaev, Sergey Matyukevich, Sergey Matyukevich

From: Vineet Gupta <vgupta@kernel.org>

Switch to generic C code when ZOL is not supported.
Generic code lacks memzero, so define it.

Signed-off-by: Vineet Gupta <vgupta@kernel.org>
---
 arch/arc/include/asm/string.h | 15 ++++++++++++++-
 arch/arc/kernel/arcksyms.c    |  2 ++
 arch/arc/lib/Makefile         |  4 ++++
 3 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/arch/arc/include/asm/string.h b/arch/arc/include/asm/string.h
index 3182ea9dcdde..5cde5226fada 100644
--- a/arch/arc/include/asm/string.h
+++ b/arch/arc/include/asm/string.h
@@ -14,6 +14,8 @@
 
 #include <linux/types.h>
 
+#ifndef CONFIG_ARC_LACKS_ZOL
+
 #define __HAVE_ARCH_MEMSET
 #define __HAVE_ARCH_MEMCPY
 #define __HAVE_ARCH_MEMCMP
@@ -22,7 +24,7 @@
 #define __HAVE_ARCH_STRCMP
 #define __HAVE_ARCH_STRLEN
 
-extern void *memset(void *ptr, int, __kernel_size_t);
+extern void *memset(void *, int, __kernel_size_t);
 extern void *memcpy(void *, const void *, __kernel_size_t);
 extern void memzero(void *ptr, __kernel_size_t n);
 extern int memcmp(const void *, const void *, __kernel_size_t);
@@ -31,4 +33,15 @@ extern char *strcpy(char *dest, const char *src);
 extern int strcmp(const char *cs, const char *ct);
 extern __kernel_size_t strlen(const char *);
 
+#else
+
+extern void *memset(void *, int, __kernel_size_t);
+
+static inline void memzero(void *s, size_t count)
+{
+	memset(s, 0, count);
+}
+
+#endif
+
 #endif /* _ASM_ARC_STRING_H */
diff --git a/arch/arc/kernel/arcksyms.c b/arch/arc/kernel/arcksyms.c
index 8851c0a19e09..d682cea639a4 100644
--- a/arch/arc/kernel/arcksyms.c
+++ b/arch/arc/kernel/arcksyms.c
@@ -45,6 +45,7 @@ EXPORT_SYMBOL(__floatunsisf);
 EXPORT_SYMBOL(__udivdi3);
 
 /* ARC optimised assembler routines */
+#ifndef CONFIG_ARC_LACKS_ZOL
 EXPORT_SYMBOL(memset);
 EXPORT_SYMBOL(memcpy);
 EXPORT_SYMBOL(memcmp);
@@ -52,3 +53,4 @@ EXPORT_SYMBOL(strchr);
 EXPORT_SYMBOL(strcpy);
 EXPORT_SYMBOL(strcmp);
 EXPORT_SYMBOL(strlen);
+#endif
diff --git a/arch/arc/lib/Makefile b/arch/arc/lib/Makefile
index 87d18f5013dc..28793e1ad1be 100644
--- a/arch/arc/lib/Makefile
+++ b/arch/arc/lib/Makefile
@@ -3,6 +3,8 @@
 # Copyright (C) 2004, 2007-2010, 2011-2012 Synopsys, Inc. (www.synopsys.com)
 #
 
+ifndef CONFIG_ARC_LACKS_ZOL
+
 lib-y	:= strchr-700.o strcpy-700.o strlen.o memcmp.o
 
 lib-$(CONFIG_ISA_ARCOMPACT)	+= memcpy-700.o memset.o strcmp.o
@@ -14,4 +16,6 @@ else
 lib-$(CONFIG_ISA_ARCV2)		+=memcpy-archs.o
 endif
 
+endif
+
 lib-$(CONFIG_ARC_LACKS_ZOL)	+= uaccess.o
-- 
2.25.1



* [RFC PATCH 09/13] ARC: delay: elide ZOL
  2022-02-22 14:14 [RFC PATCH 00/13] ARC: handle the lack of ZOL support Sergey Matyukevich
                   ` (7 preceding siblings ...)
  2022-02-22 14:15 ` [RFC PATCH 08/13] ARC: string: use generic C code if no ZOL support Sergey Matyukevich
@ 2022-02-22 14:15 ` Sergey Matyukevich
  2022-02-22 14:15 ` [RFC PATCH 10/13] ARC: checksum: " Sergey Matyukevich
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Sergey Matyukevich @ 2022-02-22 14:15 UTC (permalink / raw)
  To: linux-snps-arc
  Cc: Vineet Gupta, Vladimir Isaev, Sergey Matyukevich, Sergey Matyukevich

From: Vineet Gupta <vgupta@kernel.org>

Add __delay implementation based on DBNZ if ZOL is not supported.

Signed-off-by: Vineet Gupta <vgupta@kernel.org>
---
 arch/arc/include/asm/delay.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/arc/include/asm/delay.h b/arch/arc/include/asm/delay.h
index 54db798f0aa0..e061d1c64f24 100644
--- a/arch/arc/include/asm/delay.h
+++ b/arch/arc/include/asm/delay.h
@@ -16,9 +16,12 @@
 
 #include <asm-generic/types.h>
 #include <asm/param.h>		/* HZ */
+#include <asm/assembler.h>
 
 extern unsigned long loops_per_jiffy;
 
+#ifndef CONFIG_ARC_LACKS_ZOL
+
 static inline void __delay(unsigned long loops)
 {
 	__asm__ __volatile__(
@@ -31,6 +34,19 @@ static inline void __delay(unsigned long loops)
         : "lp_count");
 }
 
+#else
+
+static inline void __delay(unsigned long loops)
+{
+	__asm__ __volatile__(
+	"	add   %0, %0, 1         \n"
+	"1:	nop			\n"
+	"	DBNZR %0, 1b		\n"
+	: "+r"(loops));
+}
+
+#endif
+
 extern void __bad_udelay(void);
 
 /*
-- 
2.25.1
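
As an illustration of the DBNZ-based loop above, the following C model counts how many times the loop body runs. It is a sketch, assuming that DBNZR decrements the register and branches back while the result is non-zero; the function name delay_body_count is hypothetical and the model uses a 32-bit counter to match ARCv2 registers.

```c
#include <stdint.h>

/* Model of the asm loop: returns how many times the "nop" body executes. */
static uint64_t delay_body_count(uint32_t loops)
{
	uint32_t r = loops + 1;	/* add %0, %0, 1 */
	uint64_t n = 0;

	do {
		n++;		/* 1: nop */
		r--;		/* DBNZR: decrement ... */
	} while (r != 0);	/* ... and branch back if non-zero */

	return n;
}
```

Under that assumption the body runs loops + 1 times, and the initial increment keeps a request of zero loops from wrapping the counter around.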



* [RFC PATCH 10/13] ARC: checksum: elide ZOL
  2022-02-22 14:14 [RFC PATCH 00/13] ARC: handle the lack of ZOL support Sergey Matyukevich
                   ` (8 preceding siblings ...)
  2022-02-22 14:15 ` [RFC PATCH 09/13] ARC: delay: elide ZOL Sergey Matyukevich
@ 2022-02-22 14:15 ` Sergey Matyukevich
  2022-02-22 14:15 ` [RFC PATCH 11/13] ARC: head: " Sergey Matyukevich
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Sergey Matyukevich @ 2022-02-22 14:15 UTC (permalink / raw)
  To: linux-snps-arc
  Cc: Vineet Gupta, Vladimir Isaev, Sergey Matyukevich, Sergey Matyukevich

From: Vineet Gupta <vgupta@kernel.org>

Add checksum implementation based on double load/stores
if ZOL is not supported.

Signed-off-by: Vineet Gupta <vgupta@kernel.org>
---
 arch/arc/include/asm/checksum.h | 58 ++++++++++++++++++++++++++++++---
 1 file changed, 53 insertions(+), 5 deletions(-)

diff --git a/arch/arc/include/asm/checksum.h b/arch/arc/include/asm/checksum.h
index 0b485800a392..435017be9900 100644
--- a/arch/arc/include/asm/checksum.h
+++ b/arch/arc/include/asm/checksum.h
@@ -29,10 +29,13 @@ static inline __sum16 csum_fold(__wsum s)
 	s -= r;
 	return s >> 16;
 }
+#define csum_fold csum_fold
 
+#ifndef CONFIG_ARC_LACKS_ZOL
 /*
- *	This is a version of ip_compute_csum() optimized for IP headers,
- *	which always checksum on 4 octet boundaries.
+ * This is a version of ip_compute_csum() optimized for IP headers,
+ * which always checksum on 4 octet boundaries.
+ * @ihl comes from IP hdr and is number of 4-byte words
  */
 static inline __sum16
 ip_fast_csum(const void *iph, unsigned int ihl)
@@ -62,6 +65,54 @@ ip_fast_csum(const void *iph, unsigned int ihl)
 	return csum_fold(sum);
 }
 
+#else
+
+/*
+ * This is a version of ip_compute_csum() optimized for IP headers,
+ * which always checksum on 4 octet boundaries.
+ * @ihl comes from IP hdr and is number of 4-byte words
+ *  - No loop entered for canonical 5 words
+ *  - optimized for ARCv2
+ *    - LDD double load for fetching first 16 bytes
+ *    - DBNZ instruction for looping (ZOL not used)
+ */
+static inline __sum16
+ip_fast_csum(const void *iph, unsigned int ihl)
+{
+	unsigned int tmp, sum;
+	u64 dw1, dw2;
+
+	__asm__(
+#ifdef CONFIG_ARC_HAS_LL64
+	"	ldd.ab %0, [%4, 8]	\n"
+	"	ldd.ab %1, [%4, 8]	\n"
+#else
+	"	ld.ab %L0, [%4, 4]	\n"
+	"	ld.ab %H0, [%4, 4]	\n"
+	"	ld.ab %L1, [%4, 4]	\n"
+	"	ld.ab %H1, [%4, 4]	\n"
+#endif
+	"	sub    %5, %5,  4	\n"
+	"	add.f  %3, %L0, %H0	\n"
+	"	adc.f  %3, %3,  %L1	\n"
+	"	adc.f  %3, %3,  %H1	\n"
+	"1:	ld.ab  %2, [%4, 4]	\n"
+	"	adc.f  %3, %3,  %2	\n"
+	"	DBNZR  %5, 1b		\n"
+	"	add.cs %3, %3,  1	\n"
+
+	: "=&r" (dw1), "=&r" (dw2), "=&r" (tmp), "=&r" (sum),
+	  "+&r" (iph), "+&r"(ihl)
+	:
+	: "cc", "memory");
+
+	return csum_fold(sum);
+}
+
+#endif
+
+#define ip_fast_csum ip_fast_csum
+
 /*
  * TCP pseudo Header is 12 bytes:
  * SA [4], DA [4], zeroes [1], Proto[1], TCP Seg(hdr+data) Len [2]
@@ -88,9 +139,6 @@ csum_tcpudp_nofold(__be32 saddr, __be32 daddr, __u32 len,
 
 	return sum;
 }
-
-#define csum_fold csum_fold
-#define ip_fast_csum ip_fast_csum
 #define csum_tcpudp_nofold csum_tcpudp_nofold
 
 #include <asm-generic/checksum.h>
-- 
2.25.1
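
For comparison with the inline assembly above, here is a portable C reference for the same computation: a ones' complement sum over ihl 32-bit words, folded to 16 bits and complemented. This is a sketch for checking results, not the kernel code; the names ip_fast_csum_ref and csum_fold32 are hypothetical, and a 64-bit accumulator replaces the carry flag chain (add.f / adc.f / add.cs).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fold a 32-bit ones' complement sum to 16 bits and complement it. */
static uint16_t csum_fold32(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Reference checksum over ihl 4-byte words starting at iph. */
static uint16_t ip_fast_csum_ref(const void *iph, unsigned int ihl)
{
	const unsigned char *p = iph;
	uint64_t acc = 0;
	unsigned int i;

	for (i = 0; i < ihl; i++) {
		uint32_t w;

		memcpy(&w, p + 4 * i, sizeof(w));	/* alignment-safe load */
		acc += w;	/* carries collect in the upper 32 bits */
	}

	/* End-around carry: add the collected carries back in */
	acc = (acc & 0xffffffffu) + (acc >> 32);
	acc = (acc & 0xffffffffu) + (acc >> 32);

	return csum_fold32((uint32_t)acc);
}
```

A header that already carries a correct checksum field sums to zero with this routine, independent of host byte order, which makes it convenient for cross-checking an assembly variant.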



* [RFC PATCH 11/13] ARC: head: elide ZOL
  2022-02-22 14:14 [RFC PATCH 00/13] ARC: handle the lack of ZOL support Sergey Matyukevich
                   ` (9 preceding siblings ...)
  2022-02-22 14:15 ` [RFC PATCH 10/13] ARC: checksum: " Sergey Matyukevich
@ 2022-02-22 14:15 ` Sergey Matyukevich
  2022-02-22 14:15 ` [RFC PATCH 12/13] ARC: build: inhibit ZOL generation by compiler Sergey Matyukevich
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Sergey Matyukevich @ 2022-02-22 14:15 UTC (permalink / raw)
  To: linux-snps-arc
  Cc: Vineet Gupta, Vladimir Isaev, Sergey Matyukevich, Sergey Matyukevich

From: Vineet Gupta <vgupta@kernel.org>

Clear BSS in the early boot code using double stores and DBNZ
if ZOL is not supported. BSS alignment is raised to 8 bytes to
match the 8-byte stores.

Signed-off-by: Vineet Gupta <vgupta@kernel.org>
---
 arch/arc/include/asm/asm-macro-ll64-emul.h | 3 +++
 arch/arc/include/asm/entry.h               | 2 ++
 arch/arc/kernel/head.S                     | 8 +++++++-
 arch/arc/kernel/vmlinux.lds.S              | 2 +-
 4 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/arc/include/asm/asm-macro-ll64-emul.h b/arch/arc/include/asm/asm-macro-ll64-emul.h
index 886320cc74ad..417c892d557e 100644
--- a/arch/arc/include/asm/asm-macro-ll64-emul.h
+++ b/arch/arc/include/asm/asm-macro-ll64-emul.h
@@ -15,6 +15,9 @@
 	.ifeqs	"\d", "r6"
 		st.ab	r7, [\s, \incr / 2]
 	.endif
+	.ifeqs	"\d", "0"
+		st.ab	\d, [\s, \incr / 2]
+	.endif
 .endm
 
 .macro LD64.ab d, s, incr
diff --git a/arch/arc/include/asm/entry.h b/arch/arc/include/asm/entry.h
index fcdd59d77f42..1bc9f730e1e2 100644
--- a/arch/arc/include/asm/entry.h
+++ b/arch/arc/include/asm/entry.h
@@ -7,6 +7,8 @@
 #ifndef __ASM_ARC_ENTRY_H
 #define __ASM_ARC_ENTRY_H
 
+#include <asm/asm-offsets.h>
+#include <asm/assembler.h>
 #include <asm/unistd.h>		/* For NR_syscalls defination */
 #include <asm/arcregs.h>
 #include <asm/ptrace.h>
diff --git a/arch/arc/kernel/head.S b/arch/arc/kernel/head.S
index 9152782444b5..17b5426d4ca4 100644
--- a/arch/arc/kernel/head.S
+++ b/arch/arc/kernel/head.S
@@ -121,13 +121,19 @@ ENTRY(stext)
 #endif
 
 	; Clear BSS before updating any globals
-	; XXX: use ZOL here
 	mov	r5, __bss_start
 	sub	r6, __bss_stop, r5
+#ifndef CONFIG_ARC_LACKS_ZOL
 	lsr.f	lp_count, r6, 2
 	lpnz	1f
 	st.ab   0, [r5, 4]
 1:
+#else
+	lsr	r6, r6, 3
+1:
+	ST64.ab	0, r5, 8
+	DBNZR	r6, 1b
+#endif
 
 	; Uboot - kernel ABI
 	;    r0 = [0] No uboot interaction, [1] cmdline in r2, [2] DTB in r2
diff --git a/arch/arc/kernel/vmlinux.lds.S b/arch/arc/kernel/vmlinux.lds.S
index 529ae50f9fe2..00aeb89bd169 100644
--- a/arch/arc/kernel/vmlinux.lds.S
+++ b/arch/arc/kernel/vmlinux.lds.S
@@ -107,7 +107,7 @@ SECTIONS
 
 	_edata = .;
 
-	BSS_SECTION(4, 4, 4)
+	BSS_SECTION(8, 8, 8)
 
 #ifdef CONFIG_ARC_DW2_UNWIND
 	. = ALIGN(PAGE_SIZE);
-- 
2.25.1
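
The non-ZOL BSS clearing path in head.S can be sketched in C as below. This is a model under assumptions: clear_bss_model is a hypothetical name, DBNZR is taken to decrement and branch while non-zero, and the region is assumed to be 8-byte aligned with a size that is a multiple of 8 (which is what the BSS_SECTION(8, 8, 8) change is presumably for). The guard for an empty region is an addition of this sketch; the posted assembly has none.

```c
#include <stddef.h>
#include <stdint.h>

/* Clear [start, stop) with one 8-byte store per iteration. */
static void clear_bss_model(uint64_t *start, uint64_t *stop)
{
	/* Byte size shifted right by 3, like: lsr r6, r6, 3 */
	size_t n = (size_t)((unsigned char *)stop - (unsigned char *)start) >> 3;

	if (n == 0)		/* guard added by this sketch only */
		return;

	do {
		*start++ = 0;	/* ST64.ab 0, r5, 8 */
	} while (--n != 0);	/* DBNZR r6, 1b */
}
```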



* [RFC PATCH 12/13] ARC: build: inhibit ZOL generation by compiler
  2022-02-22 14:14 [RFC PATCH 00/13] ARC: handle the lack of ZOL support Sergey Matyukevich
                   ` (10 preceding siblings ...)
  2022-02-22 14:15 ` [RFC PATCH 11/13] ARC: head: " Sergey Matyukevich
@ 2022-02-22 14:15 ` Sergey Matyukevich
  2022-02-22 14:15 ` [RFC PATCH 13/13] ARC: pt_regs: handle the case when ZOL is not supported Sergey Matyukevich
  2022-02-28  2:09 ` [RFC PATCH 00/13] ARC: handle the lack of ZOL support Vineet Gupta
  13 siblings, 0 replies; 17+ messages in thread
From: Sergey Matyukevich @ 2022-02-22 14:15 UTC (permalink / raw)
  To: linux-snps-arc
  Cc: Vineet Gupta, Vladimir Isaev, Sergey Matyukevich, Sergey Matyukevich

From: Vineet Gupta <vgupta@kernel.org>

Inhibit ZOL generation by the compiler if the configuration states
that ZOL is not available. This is done before the ZOL register
save/restore is removed from the entry code.

Signed-off-by: Vineet Gupta <vgupta@kernel.org>
---
 arch/arc/Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arc/Makefile b/arch/arc/Makefile
index ec0f672bcee6..98a03c8d719c 100644
--- a/arch/arc/Makefile
+++ b/arch/arc/Makefile
@@ -44,6 +44,8 @@ cflags-y				+= -Wa,-I$(srctree)/arch/arc/include
 cflags-$(CONFIG_ARC_HAS_LLSC)		+= -mlock
 cflags-$(CONFIG_ARC_HAS_SWAPE)		+= -mswape
 
+cflags-$(CONFIG_ARC_LACKS_ZOL)		+= -fno-branch-count-reg
+
 ifdef CONFIG_ISA_ARCV2
 
 ifdef CONFIG_ARC_USE_UNALIGNED_MEM_ACCESS
-- 
2.25.1



* [RFC PATCH 13/13] ARC: pt_regs: handle the case when ZOL is not supported
  2022-02-22 14:14 [RFC PATCH 00/13] ARC: handle the lack of ZOL support Sergey Matyukevich
                   ` (11 preceding siblings ...)
  2022-02-22 14:15 ` [RFC PATCH 12/13] ARC: build: inhibit ZOL generation by compiler Sergey Matyukevich
@ 2022-02-22 14:15 ` Sergey Matyukevich
  2022-02-28  2:09 ` [RFC PATCH 00/13] ARC: handle the lack of ZOL support Vineet Gupta
  13 siblings, 0 replies; 17+ messages in thread
From: Sergey Matyukevich @ 2022-02-22 14:15 UTC (permalink / raw)
  To: linux-snps-arc
  Cc: Vineet Gupta, Vladimir Isaev, Sergey Matyukevich, Sergey Matyukevich

From: Vineet Gupta <vgupta@kernel.org>

- Entry code (interrupts/exceptions) need not save/restore ZOL regs
- Any userspace ZOL references (ptrace, signal frame, process start)
  reworked such that ZOL regs are Zero-on-read, Ignore-on-write since
  the ptrace ABI need not change

Signed-off-by: Vineet Gupta <vgupta@kernel.org>
---
 arch/arc/include/asm/entry-arcv2.h |  4 ++++
 arch/arc/include/asm/ptrace.h      |  4 +++-
 arch/arc/kernel/asm-offsets.c      |  2 ++
 arch/arc/kernel/disasm.c           |  2 ++
 arch/arc/kernel/intc-arcv2.c       |  2 ++
 arch/arc/kernel/kgdb.c             |  4 ++++
 arch/arc/kernel/process.c          |  2 ++
 arch/arc/kernel/ptrace.c           | 12 ++++++++++++
 arch/arc/kernel/signal.c           |  8 ++++++++
 arch/arc/kernel/troubleshoot.c     |  3 +++
 arch/arc/kernel/unaligned.c        |  2 ++
 11 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/arch/arc/include/asm/entry-arcv2.h b/arch/arc/include/asm/entry-arcv2.h
index 0ff4c0610561..e40a98d2ec29 100644
--- a/arch/arc/include/asm/entry-arcv2.h
+++ b/arch/arc/include/asm/entry-arcv2.h
@@ -117,11 +117,13 @@
 
 	st	blink, [sp, PT_blink]
 
+#ifndef CONFIG_ARC_LACKS_ZOL
 	lr	r10, [lp_end]
 	lr	r11, [lp_start]
 	ST2	r10, r11, PT_lpe
 
 	st	lp_count, [sp, PT_lpc]
+#endif
 
 	; skip JLI, LDI, EI for now
 .endm
@@ -205,12 +207,14 @@
 
 	ld	blink, [sp, PT_blink]
 
+#ifndef CONFIG_ARC_LACKS_ZOL
 	LD2	r10, r11, PT_lpe
 	sr	r10, [lp_end]
 	sr	r11, [lp_start]
 
 	ld	r10, [sp, PT_lpc]	; lp_count can't be target of LD
 	mov	lp_count, r10
+#endif
 
 	LD2	r0,  r1,  PT_r0
 	LD2	r2,  r3,  PT_r2
diff --git a/arch/arc/include/asm/ptrace.h b/arch/arc/include/asm/ptrace.h
index cca8d6583e31..9d2b1e7ba6a3 100644
--- a/arch/arc/include/asm/ptrace.h
+++ b/arch/arc/include/asm/ptrace.h
@@ -94,8 +94,10 @@ struct pt_regs {
 	unsigned long r0, r1, r2, r3, r4, r5, r6, r7, r8, r9, r10, r11;
 
 	unsigned long blink;
-	unsigned long lp_end, lp_start, lp_count;
 
+#ifndef CONFIG_ARC_LACKS_ZOL
+	unsigned long lp_end, lp_start, lp_count;
+#endif
 	unsigned long ei, ldi, jli;
 
 	unsigned long ret;
diff --git a/arch/arc/kernel/asm-offsets.c b/arch/arc/kernel/asm-offsets.c
index 0e884036ab74..e388d3420b3d 100644
--- a/arch/arc/kernel/asm-offsets.c
+++ b/arch/arc/kernel/asm-offsets.c
@@ -61,8 +61,10 @@ int main(void)
 	DEFINE(PT_r26, offsetof(struct pt_regs, r26));
 	DEFINE(PT_ret, offsetof(struct pt_regs, ret));
 	DEFINE(PT_blink, offsetof(struct pt_regs, blink));
+#ifndef CONFIG_ARC_LACKS_ZOL
 	DEFINE(PT_lpe, offsetof(struct pt_regs, lp_end));
 	DEFINE(PT_lpc, offsetof(struct pt_regs, lp_count));
+#endif
 	DEFINE(PT_user_r25, offsetof(struct pt_regs, user_r25));
 
 	DEFINE(SZ_CALLEE_REGS, sizeof(struct callee_regs));
diff --git a/arch/arc/kernel/disasm.c b/arch/arc/kernel/disasm.c
index 03f8b1be0c3a..c23d3829aef6 100644
--- a/arch/arc/kernel/disasm.c
+++ b/arch/arc/kernel/disasm.c
@@ -523,11 +523,13 @@ int __kprobes disasm_next_pc(unsigned long pc, struct pt_regs *regs,
 		*next_pc += instr_d.instr_len;
 	 }
 
+#ifndef CONFIG_ARC_LACKS_ZOL
 	 /* Zero Overhead Loop - end of the loop */
 	if (!(regs->status32 & STATUS32_L) && (*next_pc == regs->lp_end)
 		&& (regs->lp_count > 1)) {
 		*next_pc = regs->lp_start;
 	}
+#endif
 
 	return instr.is_branch;
 }
diff --git a/arch/arc/kernel/intc-arcv2.c b/arch/arc/kernel/intc-arcv2.c
index 5cda19d0aa91..7c1d8b2c4dce 100644
--- a/arch/arc/kernel/intc-arcv2.c
+++ b/arch/arc/kernel/intc-arcv2.c
@@ -48,7 +48,9 @@ void arc_init_IRQ(void)
 #ifndef CONFIG_ARC_IRQ_NO_AUTOSAVE
 	ictrl.save_nr_gpr_pairs = 6;	/* r0 to r11 (r12 saved manually) */
 	ictrl.save_blink = 1;
+#ifndef CONFIG_ARC_LACKS_ZOL
 	ictrl.save_lp_regs = 1;		/* LP_COUNT, LP_START, LP_END */
+#endif
 	ictrl.save_u_to_u = 0;		/* user ctxt saved on kernel stack */
 	ictrl.save_idx_regs = 1;	/* JLI, LDI, EI */
 #endif
diff --git a/arch/arc/kernel/kgdb.c b/arch/arc/kernel/kgdb.c
index 345a0000554c..6f237fdc6e54 100644
--- a/arch/arc/kernel/kgdb.c
+++ b/arch/arc/kernel/kgdb.c
@@ -27,9 +27,11 @@ static void to_gdb_regs(unsigned long *gdb_regs, struct pt_regs *kernel_regs,
 	gdb_regs[_BLINK]	= kernel_regs->blink;
 	gdb_regs[_RET]		= kernel_regs->ret;
 	gdb_regs[_STATUS32]	= kernel_regs->status32;
+#ifndef CONFIG_ARC_LACKS_ZOL
 	gdb_regs[_LP_COUNT]	= kernel_regs->lp_count;
 	gdb_regs[_LP_END]	= kernel_regs->lp_end;
 	gdb_regs[_LP_START]	= kernel_regs->lp_start;
+#endif
 	gdb_regs[_BTA]		= kernel_regs->bta;
 	gdb_regs[_STOP_PC]	= kernel_regs->ret;
 }
@@ -47,9 +49,11 @@ static void from_gdb_regs(unsigned long *gdb_regs, struct pt_regs *kernel_regs,
 	kernel_regs->blink	= gdb_regs[_BLINK];
 	kernel_regs->ret	= gdb_regs[_RET];
 	kernel_regs->status32	= gdb_regs[_STATUS32];
+#ifndef CONFIG_ARC_LACKS_ZOL
 	kernel_regs->lp_count	= gdb_regs[_LP_COUNT];
 	kernel_regs->lp_end	= gdb_regs[_LP_END];
 	kernel_regs->lp_start	= gdb_regs[_LP_START];
+#endif
 	kernel_regs->bta	= gdb_regs[_BTA];
 }
 
diff --git a/arch/arc/kernel/process.c b/arch/arc/kernel/process.c
index 8e90052f6f05..2de60b74d462 100644
--- a/arch/arc/kernel/process.c
+++ b/arch/arc/kernel/process.c
@@ -269,9 +269,11 @@ void start_thread(struct pt_regs *regs, unsigned long pc, unsigned long usp)
 
 	fpu_init_task(regs);
 
+#ifndef CONFIG_ARC_LACKS_ZOL
 	/* bogus seed values for debugging */
 	regs->lp_start = 0x10;
 	regs->lp_end = 0x80;
+#endif
 }
 
 /*
diff --git a/arch/arc/kernel/ptrace.c b/arch/arc/kernel/ptrace.c
index 883391977fdf..d3b98c9749e4 100644
--- a/arch/arc/kernel/ptrace.c
+++ b/arch/arc/kernel/ptrace.c
@@ -26,9 +26,15 @@ static int genregs_get(struct task_struct *target,
 
 	membuf_zero(&to, 4);	// pad
 	membuf_store(&to, ptregs->bta);
+#ifndef CONFIG_ARC_LACKS_ZOL
 	membuf_store(&to, ptregs->lp_start);
 	membuf_store(&to, ptregs->lp_end);
 	membuf_store(&to, ptregs->lp_count);
+#else
+	membuf_zero(&to, 4);	// ptregs->lp_start
+	membuf_zero(&to, 4);	// ptregs->lp_end
+	membuf_zero(&to, 4);	// ptregs->lp_count
+#endif
 	membuf_store(&to, ptregs->status32);
 	membuf_store(&to, ptregs->ret);
 	membuf_store(&to, ptregs->blink);
@@ -107,9 +113,15 @@ static int genregs_set(struct task_struct *target,
 	REG_IGNORE_ONE(pad);
 
 	REG_IN_ONE(scratch.bta, &ptregs->bta);
+#ifndef CONFIG_ARC_LACKS_ZOL
 	REG_IN_ONE(scratch.lp_start, &ptregs->lp_start);
 	REG_IN_ONE(scratch.lp_end, &ptregs->lp_end);
 	REG_IN_ONE(scratch.lp_count, &ptregs->lp_count);
+#else
+	REG_IGNORE_ONE(scratch.lp_start);
+	REG_IGNORE_ONE(scratch.lp_end);
+	REG_IGNORE_ONE(scratch.lp_count);
+#endif
 
 	REG_IGNORE_ONE(scratch.status32);
 
diff --git a/arch/arc/kernel/signal.c b/arch/arc/kernel/signal.c
index cb2f88502baf..449a4b0c6453 100644
--- a/arch/arc/kernel/signal.c
+++ b/arch/arc/kernel/signal.c
@@ -104,9 +104,15 @@ stash_usr_regs(struct rt_sigframe __user *sf, struct pt_regs *regs,
 	struct user_regs_struct uregs;
 
 	uregs.scratch.bta	= regs->bta;
+#ifndef CONFIG_ARC_LACKS_ZOL
 	uregs.scratch.lp_start	= regs->lp_start;
 	uregs.scratch.lp_end	= regs->lp_end;
 	uregs.scratch.lp_count	= regs->lp_count;
+#else
+	uregs.scratch.lp_start	= 0;
+	uregs.scratch.lp_end	= 0;
+	uregs.scratch.lp_count	= 0;
+#endif
 	uregs.scratch.status32	= regs->status32;
 	uregs.scratch.ret	= regs->ret;
 	uregs.scratch.blink	= regs->blink;
@@ -157,9 +163,11 @@ static int restore_usr_regs(struct pt_regs *regs, struct rt_sigframe __user *sf)
 
 	set_current_blocked(&set);
 	regs->bta	= uregs.scratch.bta;
+#ifndef CONFIG_ARC_LACKS_ZOL
 	regs->lp_start	= uregs.scratch.lp_start;
 	regs->lp_end	= uregs.scratch.lp_end;
 	regs->lp_count	= uregs.scratch.lp_count;
+#endif
 	regs->status32	= uregs.scratch.status32;
 	regs->ret	= uregs.scratch.ret;
 	regs->blink	= uregs.scratch.blink;
diff --git a/arch/arc/kernel/troubleshoot.c b/arch/arc/kernel/troubleshoot.c
index 7654c2e42dc0..acb7ee6c024d 100644
--- a/arch/arc/kernel/troubleshoot.c
+++ b/arch/arc/kernel/troubleshoot.c
@@ -22,8 +22,11 @@ static noinline void print_regs_scratch(struct pt_regs *regs)
 {
 	pr_cont("BTA: 0x%08lx\n SP: 0x%08lx  FP: 0x%08lx BLK: %pS\n",
 		regs->bta, regs->sp, regs->fp, (void *)regs->blink);
+
+#ifndef CONFIG_ARC_LACKS_ZOL
 	pr_cont("LPS: 0x%08lx\tLPE: 0x%08lx\tLPC: 0x%08lx\n",
 		regs->lp_start, regs->lp_end, regs->lp_count);
+#endif
 
 	pr_info("r00: 0x%08lx\tr01: 0x%08lx\tr02: 0x%08lx\n"	\
 		"r03: 0x%08lx\tr04: 0x%08lx\tr05: 0x%08lx\n"	\
diff --git a/arch/arc/kernel/unaligned.c b/arch/arc/kernel/unaligned.c
index d63ebd81f1c6..0937441bce04 100644
--- a/arch/arc/kernel/unaligned.c
+++ b/arch/arc/kernel/unaligned.c
@@ -244,11 +244,13 @@ int misaligned_fixup(unsigned long address, struct pt_regs *regs,
 	} else {
 		regs->ret += state.instr_len;
 
+#ifndef CONFIG_ARC_LACKS_ZOL
 		/* handle zero-overhead-loop */
 		if ((regs->ret == regs->lp_end) && (regs->lp_count)) {
 			regs->ret = regs->lp_start;
 			regs->lp_count--;
 		}
+#endif
 	}
 
 	perf_sw_event(PERF_COUNT_SW_ALIGNMENT_FAULTS, 1, regs, address);
-- 
2.25.1
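
The zero-on-read, ignore-on-write behaviour described in the commit message can be illustrated with a small C model. It is a sketch only: the struct and function names are hypothetical, the register set is cut down to a few fields, and status32 is left untouched on write to mirror the REG_IGNORE_ONE(scratch.status32) line in genregs_set().

```c
/* User-visible layout keeps the lp_* slots, so the ABI is unchanged. */
struct user_scratch_model {
	unsigned long bta, lp_start, lp_end, lp_count, status32;
};

/* Kernel-side state without ZOL registers (the LACKS_ZOL case). */
struct kregs_model {
	unsigned long bta, status32;
};

/* Read path: ZOL slots are reported as zero. */
static void model_get(const struct kregs_model *k,
		      struct user_scratch_model *u)
{
	u->bta = k->bta;
	u->lp_start = 0;	/* zero-on-read */
	u->lp_end = 0;
	u->lp_count = 0;
	u->status32 = k->status32;
}

/* Write path: ZOL slots are accepted but dropped. */
static void model_set(struct kregs_model *k,
		      const struct user_scratch_model *u)
{
	k->bta = u->bta;
	/* lp_start, lp_end, lp_count: ignore-on-write */
	/* status32 is not writable in this model either */
}
```

A debugger written against the existing register layout keeps working: it reads zeros for the loop registers and its writes to them have no effect.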



* Re: [RFC PATCH 00/13] ARC: handle the lack of ZOL support
  2022-02-22 14:14 [RFC PATCH 00/13] ARC: handle the lack of ZOL support Sergey Matyukevich
                   ` (12 preceding siblings ...)
  2022-02-22 14:15 ` [RFC PATCH 13/13] ARC: pt_regs: handle the case when ZOL is not supported Sergey Matyukevich
@ 2022-02-28  2:09 ` Vineet Gupta
  2022-03-03 19:22   ` Sergey Matyukevich
  2022-03-23 10:09   ` [RFC PATCH 00/13] ARC: handle the lack of ZOL supporty Sergey Matyukevich
  13 siblings, 2 replies; 17+ messages in thread
From: Vineet Gupta @ 2022-02-28  2:09 UTC (permalink / raw)
  To: Sergey Matyukevich, linux-snps-arc
  Cc: Vineet Gupta, Vladimir Isaev, Sergey Matyukevich

Thanks for doing this. I think the series mixes in a few things not related to
ZOL removal, and the changelog for the removal of the -Os specific code seems
incorrect, among other things.
Let me repost with slightly more cleanup.

-Vineet

On 2/22/22 06:14, Sergey Matyukevich wrote:
> From: Sergey Matyukevich <sergey.matyukevich@synopsys.com>
>
> Hi Vineet and all,
>
> This patch series continues to prepare arch/arc for the upcoming ARCv3
> support. ARCv3 does not support zero-overhead-loop (ZOL). So this patch
> series provides a set of changes that make ZOL support optional.
>
> The patch series is based on top of Linux 5.17-rc5. It has been tested
> with enabled CONFIG_ARC_LACKS_ZOL option on ARCv2 HSDK hardware as well
> as on nSIM simulator for ARCv2.
>
> I fixed typos, updated Vineet's email address, and slightly modified
> several commit messages. Otherwise this patch series is the first chunk
> of ARCv3 bring-up changes by Vineet, available at Synopsys github: see
> github.com/foss-for-synopsys-dwc-arc-processors/linux
>
> Regards,
> Sergey
>
> Vineet Gupta (13):
>    ARC: uaccess: elide unaliged handling if hardware supports
>    ARC: Kconfig: introduce option to disable ZOL
>    ARC: uaccess: drop CC_OPTIMIZE_FOR_SIZE
>    ARC: uaccess: elide ZOL, use double load/stores
>    ARCv2: memset: don't prefetch for len == 0 which happens a lot
>    ARCv2: memset: elide unaligned handling if hardware supports
>    ARCv2: memset: rewrite using double load/stores
>    ARC: string: use generic C code if no ZOL support
>    ARC: delay: elide ZOL
>    ARC: checksum: elide ZOL
>    ARC: head: elide ZOL
>    ARC: build: inhibit ZOL generation by compiler
>    ARC: pt_regs: handle the case when ZOL is not supported
>
>   arch/arc/Kconfig                           |  10 ++
>   arch/arc/Makefile                          |   3 +
>   arch/arc/include/asm/asm-macro-dbnz-emul.h |  12 ++
>   arch/arc/include/asm/asm-macro-dbnz.h      |   8 ++
>   arch/arc/include/asm/asm-macro-ll64-emul.h |  31 +++++
>   arch/arc/include/asm/asm-macro-ll64.h      |  20 +++
>   arch/arc/include/asm/assembler.h           |  41 ++++++
>   arch/arc/include/asm/checksum.h            |  58 +++++++-
>   arch/arc/include/asm/delay.h               |  16 +++
>   arch/arc/include/asm/entry-arcv2.h         |   4 +
>   arch/arc/include/asm/entry.h               |   2 +
>   arch/arc/include/asm/ptrace.h              |   4 +-
>   arch/arc/include/asm/string.h              |  15 ++-
>   arch/arc/include/asm/uaccess.h             |  29 ++--
>   arch/arc/kernel/arcksyms.c                 |   2 +
>   arch/arc/kernel/asm-offsets.c              |   2 +
>   arch/arc/kernel/disasm.c                   |   2 +
>   arch/arc/kernel/head.S                     |   8 +-
>   arch/arc/kernel/intc-arcv2.c               |   2 +
>   arch/arc/kernel/kgdb.c                     |   4 +
>   arch/arc/kernel/process.c                  |   2 +
>   arch/arc/kernel/ptrace.c                   |  12 ++
>   arch/arc/kernel/signal.c                   |   8 ++
>   arch/arc/kernel/troubleshoot.c             |   3 +
>   arch/arc/kernel/unaligned.c                |   2 +
>   arch/arc/kernel/vmlinux.lds.S              |   2 +-
>   arch/arc/lib/Makefile                      |   6 +
>   arch/arc/lib/memset-archs.S                | 147 +++++++++------------
>   arch/arc/lib/uaccess.S                     | 144 ++++++++++++++++++++
>   arch/arc/mm/extable.c                      |  11 --
>   30 files changed, 493 insertions(+), 117 deletions(-)
>   create mode 100644 arch/arc/include/asm/asm-macro-dbnz-emul.h
>   create mode 100644 arch/arc/include/asm/asm-macro-dbnz.h
>   create mode 100644 arch/arc/include/asm/asm-macro-ll64-emul.h
>   create mode 100644 arch/arc/include/asm/asm-macro-ll64.h
>   create mode 100644 arch/arc/include/asm/assembler.h
>   create mode 100644 arch/arc/lib/uaccess.S
>



* Re: [RFC PATCH 00/13] ARC: handle the lack of ZOL support
  2022-02-28  2:09 ` [RFC PATCH 00/13] ARC: handle the lack of ZOL support Vineet Gupta
@ 2022-03-03 19:22   ` Sergey Matyukevich
  2022-03-23 10:09   ` [RFC PATCH 00/13] ARC: handle the lack of ZOL supporty Sergey Matyukevich
  1 sibling, 0 replies; 17+ messages in thread
From: Sergey Matyukevich @ 2022-03-03 19:22 UTC (permalink / raw)
  To: Vineet Gupta; +Cc: linux-snps-arc, Vladimir Isaev, Sergey Matyukevich

Hi Vineet,

> Thx for doing this. I think the series mixes a few things not related to ZOL
> removal - the changelog for removal of -Os specific code seems incorrect
> etc.
> Let me repost with slight more cleanups.

Great. I will help with testing of the updated version.

Regards,
Sergey


* Re: [RFC PATCH 00/13] ARC: handle the lack of ZOL supporty
  2022-02-28  2:09 ` [RFC PATCH 00/13] ARC: handle the lack of ZOL support Vineet Gupta
  2022-03-03 19:22   ` Sergey Matyukevich
@ 2022-03-23 10:09   ` Sergey Matyukevich
  1 sibling, 0 replies; 17+ messages in thread
From: Sergey Matyukevich @ 2022-03-23 10:09 UTC (permalink / raw)
  To: Vineet Gupta; +Cc: linux-snps-arc, Vladimir Isaev, Sergey Matyukevich

Hi Vineet, 

> Thx for doing this. I think the series mixes a few things not related to ZOL
> removal - the changelog for removal of -Os specific code seems incorrect
> etc.
> Let me repost with slight more cleanups.

Let me know if you don't have the capacity to work on v2 at the moment.
In that case you can leave it to me and provide review comments for
the commits that need to be improved.

Regards,
Sergey


