* [PATCH v5 0/7] Optimize jump label implementation for ARM64
From: Jiang Liu @ 2013-10-18 15:19 UTC (permalink / raw)
  To: Steven Rostedt, Catalin Marinas, Will Deacon, Sandeepa Prabhu
  Cc: Jiang Liu, linux-arm-kernel, linux-kernel

From: Jiang Liu <jiang.liu@huawei.com>

This patchset optimizes the arch-specific jump label implementation
for ARM64 by using dynamic kernel text patching.

To enable this feature, your toolchain must support the "asm goto"
extension and the "%c" constraint extension. Current GCC for AArch64
doesn't support "%c"; a patch for it is available at:
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01314.html
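
For reference, the kind of construct the toolchain has to accept looks
roughly like the sketch below (an editorial illustration with hypothetical
names, not the exact code from this series; it assumes the generic
struct static_key from <linux/jump_label.h>):

	static __always_inline bool example_static_branch(struct static_key *key)
	{
		asm goto("1:	nop\n\t"
			 "	.pushsection	__jump_table, \"aw\"\n\t"
			 "	.align		3\n\t"
			 "	.quad		1b, %l[l_yes], %c0\n\t"
			 "	.popsection\n\t"
			 :  :  "i"(key) :  : l_yes);
		return false;
	l_yes:
		return true;
	}

"asm goto" lets the nop site fall through at near-zero cost, and "%c0"
emits the bare address of "key" into the __jump_table entry, which is why
both extensions are required.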

It has been tested on the ARM Fast Model and on a real hardware platform.

Any comments are welcome!

V4->V5:
1) rework the stop_machine() related code
2) rework aarch64_insn_gen_nop()

V3->V4:
1) resolve a race condition in kernel text patching
2) address other review comments

V2->V3:
1) fix a bug in comparing signed and unsigned values
2) detect big endian by checking __AARCH64EB__

V1->V2: address review comments of V1
1) refine comments
2) add a new interface to always synchronize with stop_machine()
   when patching code
3) handle endian issue when patching code

Jiang Liu (7):
  arm64: introduce basic aarch64 instruction decoding helpers
  arm64: introduce interfaces to hotpatch kernel and module code
  arm64: move encode_insn_immediate() from module.c to insn.c
  arm64: introduce aarch64_insn_gen_{nop|branch_imm}() helper functions
  arm64, jump label: detect %c support for ARM64
  arm64, jump label: optimize jump label implementation
  jump_label: use defined macros instead of hard-coding for better
    readability

 arch/arm64/Kconfig                  |   1 +
 arch/arm64/include/asm/insn.h       | 121 +++++++++++++++++
 arch/arm64/include/asm/jump_label.h |  52 +++++++
 arch/arm64/kernel/Makefile          |   3 +-
 arch/arm64/kernel/insn.c            | 263 ++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/jump_label.c      |  58 ++++++++
 arch/arm64/kernel/module.c          | 153 +++++++--------------
 include/linux/jump_label.h          |  15 +-
 scripts/gcc-goto.sh                 |   2 +-
 9 files changed, 554 insertions(+), 114 deletions(-)
 create mode 100644 arch/arm64/include/asm/insn.h
 create mode 100644 arch/arm64/include/asm/jump_label.h
 create mode 100644 arch/arm64/kernel/insn.c
 create mode 100644 arch/arm64/kernel/jump_label.c

-- 
1.8.1.2


* [PATCH v5 1/7] arm64: introduce basic aarch64 instruction decoding helpers
From: Jiang Liu @ 2013-10-18 15:19 UTC (permalink / raw)
  To: Steven Rostedt, Catalin Marinas, Will Deacon, Sandeepa Prabhu,
	Jiang Liu, Marc Zyngier, Arnd Bergmann, linux-arm-kernel,
	linux-kernel
  Cc: Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

Introduce basic aarch64 instruction decoding helpers
aarch64_get_insn_class() and aarch64_insn_hotpatch_safe().
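
A short usage sketch (an editorial illustration; the literal encodings
below come from the A64 encoding tables, not from this patch):

	#include <linux/bug.h>
	#include <asm/insn.h>

	static void example_hotpatch_check(void)
	{
		u32 nop = 0xd503201f;	/* A64 NOP (HINT #0) */
		u32 b   = 0x14000002;	/* "b .+8" */

		/* Both decode as branch/exception/system class. */
		WARN_ON(aarch64_get_insn_class(b) != AARCH64_INSN_CLS_BR_SYS);

		/* Replacing a NOP with a B (or vice versa) is hotpatch-safe. */
		WARN_ON(!aarch64_insn_hotpatch_safe(nop, b));
	}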

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Jiang Liu <liuj97@gmail.com>
---
 arch/arm64/include/asm/insn.h | 78 +++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/Makefile    |  2 +-
 arch/arm64/kernel/insn.c      | 90 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 169 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/include/asm/insn.h
 create mode 100644 arch/arm64/kernel/insn.c

diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
new file mode 100644
index 0000000..7499490
--- /dev/null
+++ b/arch/arm64/include/asm/insn.h
@@ -0,0 +1,78 @@
+/*
+ * Copyright (C) 2013 Huawei Ltd.
+ * Author: Jiang Liu <jiang.liu@huawei.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef	__ASM_INSN_H
+#define	__ASM_INSN_H
+#include <linux/types.h>
+
+/*
+ * ARM Architecture Reference Manual ARMv8, Section C3.1
+ * AArch64 main encoding table
+ *  Bit position
+ *   28 27 26 25	Encoding Group
+ *   0  0  -  -		Unallocated
+ *   1  0  0  -		Data processing, immediate
+ *   1  0  1  -		Branch, exception generation and system instructions
+ *   -  1  -  0		Loads and stores
+ *   -  1  0  1		Data processing - register
+ *   0  1  1  1		Data processing - SIMD and floating point
+ *   1  1  1  1		Data processing - SIMD and floating point
+ * "-" means "don't care"
+ */
+enum aarch64_insn_encoding_class {
+	AARCH64_INSN_CLS_UNKNOWN,	/* UNALLOCATED */
+	AARCH64_INSN_CLS_DP_IMM,	/* Data processing - immediate */
+	AARCH64_INSN_CLS_DP_REG,	/* Data processing - register */
+	AARCH64_INSN_CLS_DP_FPSIMD,	/* Data processing - SIMD and FP */
+	AARCH64_INSN_CLS_LDST,		/* Loads and stores */
+	AARCH64_INSN_CLS_BR_SYS,	/* Branch, exception generation and
+					 * system instructions */
+};
+
+#define	__AARCH64_INSN_FUNCS(abbr, mask, val)	\
+static __always_inline bool aarch64_insn_is_##abbr(u32 code) \
+{ return (code & (mask)) == (val); }	\
+static __always_inline u32 aarch64_insn_get_##abbr##_mask(void) \
+{ return (mask); } \
+static __always_inline u32 aarch64_insn_get_##abbr##_value(void) \
+{ return (val); }
+
+__AARCH64_INSN_FUNCS(b,		0xFC000000, 0x14000000)
+__AARCH64_INSN_FUNCS(bl,	0xFC000000, 0x94000000)
+__AARCH64_INSN_FUNCS(svc,	0xFFE0001F, 0xD4000001)
+__AARCH64_INSN_FUNCS(hvc,	0xFFE0001F, 0xD4000002)
+__AARCH64_INSN_FUNCS(smc,	0xFFE0001F, 0xD4000003)
+__AARCH64_INSN_FUNCS(brk,	0xFFE0001F, 0xD4200000)
+__AARCH64_INSN_FUNCS(hint,	0xFFFFF01F, 0xD503201F)
+
+#undef	__AARCH64_INSN_FUNCS
+
+enum aarch64_insn_hint_op {
+	AARCH64_INSN_HINT_NOP	= 0x0 << 5,
+	AARCH64_INSN_HINT_YIELD	= 0x1 << 5,
+	AARCH64_INSN_HINT_WFE	= 0x2 << 5,
+	AARCH64_INSN_HINT_WFI	= 0x3 << 5,
+	AARCH64_INSN_HINT_SEV	= 0x4 << 5,
+	AARCH64_INSN_HINT_SEVL	= 0x5 << 5,
+};
+
+bool aarch64_insn_is_nop(u32 insn);
+
+enum aarch64_insn_encoding_class aarch64_get_insn_class(u32 insn);
+
+bool aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn);
+
+#endif	/* __ASM_INSN_H */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 7b4b564..9af6cb3 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -9,7 +9,7 @@ AFLAGS_head.o		:= -DTEXT_OFFSET=$(TEXT_OFFSET)
 arm64-obj-y		:= cputable.o debug-monitors.o entry.o irq.o fpsimd.o	\
 			   entry-fpsimd.o process.o ptrace.o setup.o signal.o	\
 			   sys.o stacktrace.o time.o traps.o io.o vdso.o	\
-			   hyp-stub.o psci.o
+			   hyp-stub.o psci.o insn.o
 
 arm64-obj-$(CONFIG_COMPAT)		+= sys32.o kuser32.o signal32.o 	\
 					   sys_compat.o
diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
new file mode 100644
index 0000000..f5b779f
--- /dev/null
+++ b/arch/arm64/kernel/insn.c
@@ -0,0 +1,90 @@
+/*
+ * Copyright (C) 2013 Huawei Ltd.
+ * Author: Jiang Liu <jiang.liu@huawei.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#include <linux/compiler.h>
+#include <linux/kernel.h>
+#include <asm/insn.h>
+
+static int aarch64_insn_encoding_class[] = {
+	AARCH64_INSN_CLS_UNKNOWN,
+	AARCH64_INSN_CLS_UNKNOWN,
+	AARCH64_INSN_CLS_UNKNOWN,
+	AARCH64_INSN_CLS_UNKNOWN,
+	AARCH64_INSN_CLS_LDST,
+	AARCH64_INSN_CLS_DP_REG,
+	AARCH64_INSN_CLS_LDST,
+	AARCH64_INSN_CLS_DP_FPSIMD,
+	AARCH64_INSN_CLS_DP_IMM,
+	AARCH64_INSN_CLS_DP_IMM,
+	AARCH64_INSN_CLS_BR_SYS,
+	AARCH64_INSN_CLS_BR_SYS,
+	AARCH64_INSN_CLS_LDST,
+	AARCH64_INSN_CLS_DP_REG,
+	AARCH64_INSN_CLS_LDST,
+	AARCH64_INSN_CLS_DP_FPSIMD,
+};
+
+enum aarch64_insn_encoding_class __kprobes aarch64_get_insn_class(u32 insn)
+{
+	return aarch64_insn_encoding_class[(insn >> 25) & 0xf];
+}
+
+/* NOP is an alias of HINT */
+bool aarch64_insn_is_nop(u32 insn)
+{
+	if (!aarch64_insn_is_hint(insn))
+		return false;
+
+	switch (insn & 0xFE0) {
+	case AARCH64_INSN_HINT_YIELD:
+	case AARCH64_INSN_HINT_WFE:
+	case AARCH64_INSN_HINT_WFI:
+	case AARCH64_INSN_HINT_SEV:
+	case AARCH64_INSN_HINT_SEVL:
+		return false;
+	default:
+		return true;
+	}
+}
+
+static bool __kprobes __aarch64_insn_hotpatch_safe(u32 insn)
+{
+	if (aarch64_get_insn_class(insn) != AARCH64_INSN_CLS_BR_SYS)
+		return false;
+
+	return	aarch64_insn_is_b(insn) ||
+		aarch64_insn_is_bl(insn) ||
+		aarch64_insn_is_svc(insn) ||
+		aarch64_insn_is_hvc(insn) ||
+		aarch64_insn_is_smc(insn) ||
+		aarch64_insn_is_brk(insn) ||
+		aarch64_insn_is_nop(insn);
+}
+
+/*
+ * ARMv8-A Section B2.6.5:
+ * Concurrent modification and execution of instructions can lead to the
+ * resulting instruction performing any behavior that can be achieved by
+ * executing any sequence of instructions that can be executed from the
+ * same Exception level, except where the instruction before modification
+ * and the instruction after modification is a B, BL, NOP, BKPT, SVC, HVC,
+ * or SMC instruction.
+ */
+bool __kprobes aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn)
+{
+	return __aarch64_insn_hotpatch_safe(old_insn) &&
+	       __aarch64_insn_hotpatch_safe(new_insn);
+}
-- 
1.8.1.2


* [PATCH v5 2/7] arm64: introduce interfaces to hotpatch kernel and module code
From: Jiang Liu @ 2013-10-18 15:19 UTC (permalink / raw)
  To: Steven Rostedt, Catalin Marinas, Will Deacon, Sandeepa Prabhu,
	Jiang Liu, linux-arm-kernel, linux-kernel
  Cc: Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

Introduce three interfaces to patch kernel and module code:
aarch64_insn_patch_text_nosync():
	patch code without synchronization; it is the caller's responsibility
	to synchronize all CPUs if needed.
aarch64_insn_patch_text_sync():
	patch code and always synchronize with stop_machine().
aarch64_insn_patch_text():
	patch code and synchronize with stop_machine() if needed, as in the
	usage sketch below.
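
A minimal usage sketch (an editorial illustration, not part of this patch;
the "patch a single site with a NOP" scenario is assumed):

	#include <asm/insn.h>

	static int example_patch_one_nop(void *text_addr)
	{
		void *addrs[] = { text_addr };
		u32 insns[] = { 0xd503201f };	/* A64 NOP (HINT #0) */

		/*
		 * aarch64_insn_patch_text() checks whether the old and new
		 * instructions are hotpatch-safe; if not, it falls back to
		 * the stop_machine()-based variant.
		 */
		return aarch64_insn_patch_text(addrs, insns, 1);
	}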

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Jiang Liu <liuj97@gmail.com>
---
 arch/arm64/include/asm/insn.h | 19 ++++++++-
 arch/arm64/kernel/insn.c      | 91 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 109 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index 7499490..7a69491 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -71,8 +71,25 @@ enum aarch64_insn_hint_op {
 
 bool aarch64_insn_is_nop(u32 insn);
 
-enum aarch64_insn_encoding_class aarch64_get_insn_class(u32 insn);
+/*
+ * In ARMv8-A, A64 instructions have a fixed length of 32 bits and are always
+ * little-endian.
+ */
+static __always_inline u32 aarch64_insn_read(void *addr)
+{
+	return le32_to_cpu(*(u32 *)addr);
+}
 
+static __always_inline void aarch64_insn_write(void *addr, u32 insn)
+{
+	*(u32 *)addr = cpu_to_le32(insn);
+}
+
+enum aarch64_insn_encoding_class aarch64_get_insn_class(u32 insn);
 bool aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn);
 
+int aarch64_insn_patch_text_nosync(void *addr, u32 insn);
+int aarch64_insn_patch_text_sync(void *addrs[], u32 insns[], int cnt);
+int aarch64_insn_patch_text(void *addrs[], u32 insns[], int cnt);
+
 #endif	/* __ASM_INSN_H */
diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index f5b779f..3879db4 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -16,6 +16,9 @@
  */
 #include <linux/compiler.h>
 #include <linux/kernel.h>
+#include <linux/smp.h>
+#include <linux/stop_machine.h>
+#include <asm/cacheflush.h>
 #include <asm/insn.h>
 
 static int aarch64_insn_encoding_class[] = {
@@ -88,3 +91,91 @@ bool __kprobes aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn)
 	return __aarch64_insn_hotpatch_safe(old_insn) &&
 	       __aarch64_insn_hotpatch_safe(new_insn);
 }
+
+int __kprobes aarch64_insn_patch_text_nosync(void *addr, u32 insn)
+{
+	u32 *tp = addr;
+
+	/* A64 instructions must be word aligned */
+	if ((uintptr_t)tp & 0x3)
+		return -EINVAL;
+
+	aarch64_insn_write(tp, insn);
+	flush_icache_range((uintptr_t)tp, (uintptr_t)tp + sizeof(u32));
+
+	return 0;
+}
+
+struct aarch64_insn_patch {
+	void	**text_addrs;
+	u32	*new_insns;
+	int	insn_cnt;
+};
+
+static DEFINE_MUTEX(text_patch_lock);
+static atomic_t text_patch_id;
+
+static int __kprobes aarch64_insn_patch_text_cb(void *arg)
+{
+	int i, ret = 0;
+	struct aarch64_insn_patch *pp = arg;
+
+	if (atomic_read(&text_patch_id) == smp_processor_id()) {
+		for (i = 0; ret == 0 && i < pp->insn_cnt; i++)
+			ret = aarch64_insn_patch_text_nosync(pp->text_addrs[i],
+							     pp->new_insns[i]);
+		/* Let other CPU continue */
+		atomic_set(&text_patch_id, -1);
+	} else {
+		while (atomic_read(&text_patch_id) != -1)
+			cpu_relax();
+		isb();
+	}
+
+	return ret;
+}
+
+int __kprobes aarch64_insn_patch_text_sync(void *addrs[], u32 insns[], int cnt)
+{
+	int ret;
+	struct aarch64_insn_patch patch = {
+		.text_addrs = addrs,
+		.new_insns = insns,
+		.insn_cnt = cnt,
+	};
+
+	if (cnt <= 0)
+		return -EINVAL;
+
+	mutex_lock(&text_patch_lock);
+	atomic_set(&text_patch_id, smp_processor_id());
+	ret = stop_machine(aarch64_insn_patch_text_cb, &patch, cpu_online_mask);
+	mutex_unlock(&text_patch_lock);
+
+	return ret;
+}
+
+int __kprobes aarch64_insn_patch_text(void *addrs[], u32 insns[], int cnt)
+{
+	int ret;
+	bool safe = false;
+
+	if (cnt == 1)
+		safe = aarch64_insn_hotpatch_safe(aarch64_insn_read(addrs[0]),
+						  insns[0]);
+
+	if (safe) {
+		/*
+		 * ARMv8 architecture doesn't guarantee all CPUs see the new
+		 * instruction after returning from function
+		 * aarch64_insn_patch_text_nosync(). So send an IPI to all other
+		 * CPUs to achieve instruction synchronization.
+		 */
+		ret = aarch64_insn_patch_text_nosync(addrs[0], insns[0]);
+		kick_all_cpus_sync();
+	} else {
+		ret = aarch64_insn_patch_text_sync(addrs, insns, cnt);
+	}
+
+	return ret;
+}
-- 
1.8.1.2


* [PATCH v5 3/7] arm64: move encode_insn_immediate() from module.c to insn.c
From: Jiang Liu @ 2013-10-18 15:19 UTC (permalink / raw)
  To: Steven Rostedt, Catalin Marinas, Will Deacon, Sandeepa Prabhu,
	Jiang Liu, linux-arm-kernel, linux-kernel
  Cc: Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

Function encode_insn_immediate() will be used by other instruction
manipulation routines, so move it into insn.c and rename it to
aarch64_insn_encode_immediate().
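
A small worked example of the renamed helper (editorial; the opcode values
come from the A64 encoding tables, not from this patch):

	/* Start from a B instruction whose imm26 field is zero. */
	u32 insn = 0x14000000;

	/* Encode a forward branch of 8 bytes: imm26 = 8 >> 2 = 2. */
	insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_26, insn, 8 >> 2);

	/* insn is now 0x14000002, i.e. "b .+8". */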

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Jiang Liu <liuj97@gmail.com>
---
 arch/arm64/include/asm/insn.h |  13 ++++
 arch/arm64/kernel/insn.c      |  54 +++++++++++++++
 arch/arm64/kernel/module.c    | 153 +++++++++++++-----------------------------
 3 files changed, 113 insertions(+), 107 deletions(-)

diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index 7a69491..8f94e48 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -42,6 +42,17 @@ enum aarch64_insn_encoding_class {
 					 * system instructions */
 };
 
+enum aarch64_insn_imm_type {
+	AARCH64_INSN_IMM_ADR,
+	AARCH64_INSN_IMM_26,
+	AARCH64_INSN_IMM_19,
+	AARCH64_INSN_IMM_16,
+	AARCH64_INSN_IMM_14,
+	AARCH64_INSN_IMM_12,
+	AARCH64_INSN_IMM_9,
+	AARCH64_INSN_IMM_MAX
+};
+
 #define	__AARCH64_INSN_FUNCS(abbr, mask, val)	\
 static __always_inline bool aarch64_insn_is_##abbr(u32 code) \
 { return (code & (mask)) == (val); }	\
@@ -86,6 +97,8 @@ static __always_inline void aarch64_insn_write(void *addr, u32 insn)
 }
 
 enum aarch64_insn_encoding_class aarch64_get_insn_class(u32 insn);
+u32 aarch64_insn_encode_immediate(enum aarch64_insn_imm_type type,
+				  u32 insn, u64 imm);
 bool aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn);
 
 int aarch64_insn_patch_text_nosync(void *addr, u32 insn);
diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index 3879db4..886821f 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -179,3 +179,57 @@ int __kprobes aarch64_insn_patch_text(void *addrs[], u32 insns[], int cnt)
 
 	return ret;
 }
+
+u32 aarch64_insn_encode_immediate(enum aarch64_insn_imm_type type,
+				  u32 insn, u64 imm)
+{
+	u32 immlo, immhi, lomask, himask, mask;
+	int shift;
+
+	switch (type) {
+	case AARCH64_INSN_IMM_ADR:
+		lomask = 0x3;
+		himask = 0x7ffff;
+		immlo = imm & lomask;
+		imm >>= 2;
+		immhi = imm & himask;
+		imm = (immlo << 24) | (immhi);
+		mask = (lomask << 24) | (himask);
+		shift = 5;
+		break;
+	case AARCH64_INSN_IMM_26:
+		mask = BIT(26) - 1;
+		shift = 0;
+		break;
+	case AARCH64_INSN_IMM_19:
+		mask = BIT(19) - 1;
+		shift = 5;
+		break;
+	case AARCH64_INSN_IMM_16:
+		mask = BIT(16) - 1;
+		shift = 5;
+		break;
+	case AARCH64_INSN_IMM_14:
+		mask = BIT(14) - 1;
+		shift = 5;
+		break;
+	case AARCH64_INSN_IMM_12:
+		mask = BIT(12) - 1;
+		shift = 10;
+		break;
+	case AARCH64_INSN_IMM_9:
+		mask = BIT(9) - 1;
+		shift = 12;
+		break;
+	default:
+		pr_err("aarch64_insn_encode_immediate: unknown immediate encoding %d\n",
+			type);
+		return 0;
+	}
+
+	/* Update the immediate field. */
+	insn &= ~(mask << shift);
+	insn |= (imm & mask) << shift;
+
+	return insn;
+}
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index ca0e3d5..b02d723 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -25,6 +25,10 @@
 #include <linux/mm.h>
 #include <linux/moduleloader.h>
 #include <linux/vmalloc.h>
+#include <asm/insn.h>
+
+#define	AARCH64_INSN_IMM_MOVNZ		AARCH64_INSN_IMM_MAX
+#define	AARCH64_INSN_IMM_MOVK		AARCH64_INSN_IMM_16
 
 void *module_alloc(unsigned long size)
 {
@@ -94,25 +98,19 @@ static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len)
 	return 0;
 }
 
-enum aarch64_imm_type {
-	INSN_IMM_MOVNZ,
-	INSN_IMM_MOVK,
-	INSN_IMM_ADR,
-	INSN_IMM_26,
-	INSN_IMM_19,
-	INSN_IMM_16,
-	INSN_IMM_14,
-	INSN_IMM_12,
-	INSN_IMM_9,
-};
-
-static u32 encode_insn_immediate(enum aarch64_imm_type type, u32 insn, u64 imm)
+static int reloc_insn_movw(enum aarch64_reloc_op op, void *place, u64 val,
+			   int lsb, enum aarch64_insn_imm_type imm_type)
 {
-	u32 immlo, immhi, lomask, himask, mask;
-	int shift;
+	u64 imm, limit = 0;
+	s64 sval;
+	u32 insn = *(u32 *)place;
+
+	sval = do_reloc(op, place, val);
+	sval >>= lsb;
+	imm = sval & 0xffff;
+
 
-	switch (type) {
-	case INSN_IMM_MOVNZ:
+	if (imm_type == AARCH64_INSN_IMM_MOVNZ) {
 		/*
 		 * For signed MOVW relocations, we have to manipulate the
 		 * instruction encoding depending on whether or not the
@@ -131,70 +129,11 @@ static u32 encode_insn_immediate(enum aarch64_imm_type type, u32 insn, u64 imm)
 			 */
 			imm = ~imm;
 		}
-	case INSN_IMM_MOVK:
-		mask = BIT(16) - 1;
-		shift = 5;
-		break;
-	case INSN_IMM_ADR:
-		lomask = 0x3;
-		himask = 0x7ffff;
-		immlo = imm & lomask;
-		imm >>= 2;
-		immhi = imm & himask;
-		imm = (immlo << 24) | (immhi);
-		mask = (lomask << 24) | (himask);
-		shift = 5;
-		break;
-	case INSN_IMM_26:
-		mask = BIT(26) - 1;
-		shift = 0;
-		break;
-	case INSN_IMM_19:
-		mask = BIT(19) - 1;
-		shift = 5;
-		break;
-	case INSN_IMM_16:
-		mask = BIT(16) - 1;
-		shift = 5;
-		break;
-	case INSN_IMM_14:
-		mask = BIT(14) - 1;
-		shift = 5;
-		break;
-	case INSN_IMM_12:
-		mask = BIT(12) - 1;
-		shift = 10;
-		break;
-	case INSN_IMM_9:
-		mask = BIT(9) - 1;
-		shift = 12;
-		break;
-	default:
-		pr_err("encode_insn_immediate: unknown immediate encoding %d\n",
-			type);
-		return 0;
+		imm_type = AARCH64_INSN_IMM_MOVK;
 	}
 
-	/* Update the immediate field. */
-	insn &= ~(mask << shift);
-	insn |= (imm & mask) << shift;
-
-	return insn;
-}
-
-static int reloc_insn_movw(enum aarch64_reloc_op op, void *place, u64 val,
-			   int lsb, enum aarch64_imm_type imm_type)
-{
-	u64 imm, limit = 0;
-	s64 sval;
-	u32 insn = *(u32 *)place;
-
-	sval = do_reloc(op, place, val);
-	sval >>= lsb;
-	imm = sval & 0xffff;
-
 	/* Update the instruction with the new encoding. */
-	*(u32 *)place = encode_insn_immediate(imm_type, insn, imm);
+	*(u32 *)place = aarch64_insn_encode_immediate(imm_type, insn, imm);
 
 	/* Shift out the immediate field. */
 	sval >>= 16;
@@ -203,9 +142,9 @@ static int reloc_insn_movw(enum aarch64_reloc_op op, void *place, u64 val,
 	 * For unsigned immediates, the overflow check is straightforward.
 	 * For signed immediates, the sign bit is actually the bit past the
 	 * most significant bit of the field.
-	 * The INSN_IMM_16 immediate type is unsigned.
+	 * The AARCH64_INSN_IMM_16 immediate type is unsigned.
 	 */
-	if (imm_type != INSN_IMM_16) {
+	if (imm_type != AARCH64_INSN_IMM_16) {
 		sval++;
 		limit++;
 	}
@@ -218,7 +157,7 @@ static int reloc_insn_movw(enum aarch64_reloc_op op, void *place, u64 val,
 }
 
 static int reloc_insn_imm(enum aarch64_reloc_op op, void *place, u64 val,
-			  int lsb, int len, enum aarch64_imm_type imm_type)
+			  int lsb, int len, enum aarch64_insn_imm_type imm_type)
 {
 	u64 imm, imm_mask;
 	s64 sval;
@@ -233,7 +172,7 @@ static int reloc_insn_imm(enum aarch64_reloc_op op, void *place, u64 val,
 	imm = sval & imm_mask;
 
 	/* Update the instruction's immediate field. */
-	*(u32 *)place = encode_insn_immediate(imm_type, insn, imm);
+	*(u32 *)place = aarch64_insn_encode_immediate(imm_type, insn, imm);
 
 	/*
 	 * Extract the upper value bits (including the sign bit) and
@@ -315,125 +254,125 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
 			overflow_check = false;
 		case R_AARCH64_MOVW_UABS_G0:
 			ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 0,
-					      INSN_IMM_16);
+					      AARCH64_INSN_IMM_16);
 			break;
 		case R_AARCH64_MOVW_UABS_G1_NC:
 			overflow_check = false;
 		case R_AARCH64_MOVW_UABS_G1:
 			ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 16,
-					      INSN_IMM_16);
+					      AARCH64_INSN_IMM_16);
 			break;
 		case R_AARCH64_MOVW_UABS_G2_NC:
 			overflow_check = false;
 		case R_AARCH64_MOVW_UABS_G2:
 			ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 32,
-					      INSN_IMM_16);
+					      AARCH64_INSN_IMM_16);
 			break;
 		case R_AARCH64_MOVW_UABS_G3:
 			/* We're using the top bits so we can't overflow. */
 			overflow_check = false;
 			ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 48,
-					      INSN_IMM_16);
+					      AARCH64_INSN_IMM_16);
 			break;
 		case R_AARCH64_MOVW_SABS_G0:
 			ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 0,
-					      INSN_IMM_MOVNZ);
+					      AARCH64_INSN_IMM_MOVNZ);
 			break;
 		case R_AARCH64_MOVW_SABS_G1:
 			ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 16,
-					      INSN_IMM_MOVNZ);
+					      AARCH64_INSN_IMM_MOVNZ);
 			break;
 		case R_AARCH64_MOVW_SABS_G2:
 			ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 32,
-					      INSN_IMM_MOVNZ);
+					      AARCH64_INSN_IMM_MOVNZ);
 			break;
 		case R_AARCH64_MOVW_PREL_G0_NC:
 			overflow_check = false;
 			ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 0,
-					      INSN_IMM_MOVK);
+					      AARCH64_INSN_IMM_MOVK);
 			break;
 		case R_AARCH64_MOVW_PREL_G0:
 			ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 0,
-					      INSN_IMM_MOVNZ);
+					      AARCH64_INSN_IMM_MOVNZ);
 			break;
 		case R_AARCH64_MOVW_PREL_G1_NC:
 			overflow_check = false;
 			ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 16,
-					      INSN_IMM_MOVK);
+					      AARCH64_INSN_IMM_MOVK);
 			break;
 		case R_AARCH64_MOVW_PREL_G1:
 			ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 16,
-					      INSN_IMM_MOVNZ);
+					      AARCH64_INSN_IMM_MOVNZ);
 			break;
 		case R_AARCH64_MOVW_PREL_G2_NC:
 			overflow_check = false;
 			ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 32,
-					      INSN_IMM_MOVK);
+					      AARCH64_INSN_IMM_MOVK);
 			break;
 		case R_AARCH64_MOVW_PREL_G2:
 			ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 32,
-					      INSN_IMM_MOVNZ);
+					      AARCH64_INSN_IMM_MOVNZ);
 			break;
 		case R_AARCH64_MOVW_PREL_G3:
 			/* We're using the top bits so we can't overflow. */
 			overflow_check = false;
 			ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 48,
-					      INSN_IMM_MOVNZ);
+					      AARCH64_INSN_IMM_MOVNZ);
 			break;
 
 		/* Immediate instruction relocations. */
 		case R_AARCH64_LD_PREL_LO19:
 			ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 19,
-					     INSN_IMM_19);
+					     AARCH64_INSN_IMM_19);
 			break;
 		case R_AARCH64_ADR_PREL_LO21:
 			ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 0, 21,
-					     INSN_IMM_ADR);
+					     AARCH64_INSN_IMM_ADR);
 			break;
 		case R_AARCH64_ADR_PREL_PG_HI21_NC:
 			overflow_check = false;
 		case R_AARCH64_ADR_PREL_PG_HI21:
 			ovf = reloc_insn_imm(RELOC_OP_PAGE, loc, val, 12, 21,
-					     INSN_IMM_ADR);
+					     AARCH64_INSN_IMM_ADR);
 			break;
 		case R_AARCH64_ADD_ABS_LO12_NC:
 		case R_AARCH64_LDST8_ABS_LO12_NC:
 			overflow_check = false;
 			ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 0, 12,
-					     INSN_IMM_12);
+					     AARCH64_INSN_IMM_12);
 			break;
 		case R_AARCH64_LDST16_ABS_LO12_NC:
 			overflow_check = false;
 			ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 1, 11,
-					     INSN_IMM_12);
+					     AARCH64_INSN_IMM_12);
 			break;
 		case R_AARCH64_LDST32_ABS_LO12_NC:
 			overflow_check = false;
 			ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 2, 10,
-					     INSN_IMM_12);
+					     AARCH64_INSN_IMM_12);
 			break;
 		case R_AARCH64_LDST64_ABS_LO12_NC:
 			overflow_check = false;
 			ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 3, 9,
-					     INSN_IMM_12);
+					     AARCH64_INSN_IMM_12);
 			break;
 		case R_AARCH64_LDST128_ABS_LO12_NC:
 			overflow_check = false;
 			ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 4, 8,
-					     INSN_IMM_12);
+					     AARCH64_INSN_IMM_12);
 			break;
 		case R_AARCH64_TSTBR14:
 			ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 14,
-					     INSN_IMM_14);
+					     AARCH64_INSN_IMM_14);
 			break;
 		case R_AARCH64_CONDBR19:
 			ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 19,
-					     INSN_IMM_19);
+					     AARCH64_INSN_IMM_19);
 			break;
 		case R_AARCH64_JUMP26:
 		case R_AARCH64_CALL26:
 			ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 26,
-					     INSN_IMM_26);
+					     AARCH64_INSN_IMM_26);
 			break;
 
 		default:
-- 
1.8.1.2


* [PATCH v5 4/7] arm64: introduce aarch64_insn_gen_{nop|branch_imm}() helper functions
From: Jiang Liu @ 2013-10-18 15:19 UTC (permalink / raw)
  To: Steven Rostedt, Catalin Marinas, Will Deacon, Sandeepa Prabhu,
	Jiang Liu, linux-arm-kernel, linux-kernel
  Cc: Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

Introduce aarch64_insn_gen_{nop|branch_imm}() helper functions, which
will be used to implement jump label on ARM64.
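
A usage sketch of how a jump label update can pick its instruction
(an editorial illustration; enable_branch, entry_code and entry_target are
hypothetical names, and the real arm64 jump label code comes in a later
patch of this series):

	void *addr = (void *)entry_code;
	u32 insn;

	if (enable_branch)
		/* Branch from the jump entry's code address to its target. */
		insn = aarch64_insn_gen_branch_imm(entry_code, entry_target,
						   false);
	else
		insn = aarch64_insn_gen_nop();

	aarch64_insn_patch_text(&addr, &insn, 1);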

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Jiang Liu <liuj97@gmail.com>
---
 arch/arm64/include/asm/insn.h | 13 +++++++++++++
 arch/arm64/kernel/insn.c      | 28 ++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+)

diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index 8f94e48..ee4a60d 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -96,9 +96,22 @@ static __always_inline void aarch64_insn_write(void *addr, u32 insn)
 	*(u32 *)addr = cpu_to_le32(insn);
 }
 
+static __always_inline u32 aarch64_insn_gen_hint(enum aarch64_insn_hint_op op)
+{
+	return aarch64_insn_get_hint_value() | op;
+}
+
+static __always_inline u32 aarch64_insn_gen_nop(void)
+{
+	return aarch64_insn_gen_hint(AARCH64_INSN_HINT_NOP);
+}
+
 enum aarch64_insn_encoding_class aarch64_get_insn_class(u32 insn);
 u32 aarch64_insn_encode_immediate(enum aarch64_insn_imm_type type,
 				  u32 insn, u64 imm);
+u32 aarch64_insn_gen_branch_imm(unsigned long pc, unsigned long addr,
+				bool link);
+
 bool aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn);
 
 int aarch64_insn_patch_text_nosync(void *addr, u32 insn);
diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index 886821f..f7498cc 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -14,6 +14,7 @@
  * You should have received a copy of the GNU General Public License
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
+#include <linux/bitops.h>
 #include <linux/compiler.h>
 #include <linux/kernel.h>
 #include <linux/smp.h>
@@ -233,3 +234,30 @@ u32 aarch64_insn_encode_immediate(enum aarch64_insn_imm_type type,
 
 	return insn;
 }
+
+u32 aarch64_insn_gen_branch_imm(unsigned long pc, unsigned long addr, bool link)
+{
+	u32 insn;
+	long offset;
+
+	/*
+	 * PC: A 64-bit Program Counter holding the address of the current
+	 * instruction. A64 instructions must be word aligned.
+	 */
+	BUG_ON((pc & 0x3) || (addr & 0x3));
+
+	/*
+	 * B/BL supports a [-128M, 128M) offset.
+	 * The ARM64 virtual address arrangement guarantees that all kernel
+	 * and module text is within +/-128M.
+	 */
+	offset = ((long)addr - (long)pc) >> 2;
+	BUG_ON(abs(offset) > BIT(25) || offset == BIT(25));
+
+	if (link)
+		insn = aarch64_insn_get_bl_value();
+	else
+		insn = aarch64_insn_get_b_value();
+
+	return aarch64_insn_encode_immediate(AARCH64_INSN_IMM_26, insn, offset);
+}
-- 
1.8.1.2


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v5 5/7] arm64, jump label: detect %c support for ARM64
  2013-10-18 15:19 ` Jiang Liu
@ 2013-10-18 15:19   ` Jiang Liu
  -1 siblings, 0 replies; 54+ messages in thread
From: Jiang Liu @ 2013-10-18 15:19 UTC (permalink / raw)
  To: Steven Rostedt, Catalin Marinas, Will Deacon, Sandeepa Prabhu,
	Jiang Liu, linux-kernel
  Cc: linux-arm-kernel, Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

As commit a9468f30b5eac6 "ARM: 7333/2: jump label: detect %c
support for ARM", this patch detects the same thing for ARM64
because some ARM64 GCC versions have the same issue.

Some versions of ARM64 GCC which do support asm goto, do not
support the %c specifier. Since we need the %c to support jump
labels on ARM64, detect that too in the asm goto detection script
to avoid build errors with these versions.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Jiang Liu <liuj97@gmail.com>
---
 scripts/gcc-goto.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/gcc-goto.sh b/scripts/gcc-goto.sh
index a2af2e8..c9469d3 100644
--- a/scripts/gcc-goto.sh
+++ b/scripts/gcc-goto.sh
@@ -5,7 +5,7 @@
 cat << "END" | $@ -x c - -c -o /dev/null >/dev/null 2>&1 && echo "y"
 int main(void)
 {
-#ifdef __arm__
+#if defined(__arm__) || defined(__aarch64__)
 	/*
 	 * Not related to asm goto, but used by jump label
 	 * and broken on some ARM GCC versions (see GCC Bug 48637).
-- 
1.8.1.2


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v5 6/7] arm64, jump label: optimize jump label implementation
  2013-10-18 15:19 ` Jiang Liu
@ 2013-10-18 15:20   ` Jiang Liu
  -1 siblings, 0 replies; 54+ messages in thread
From: Jiang Liu @ 2013-10-18 15:20 UTC (permalink / raw)
  To: Steven Rostedt, Catalin Marinas, Will Deacon, Sandeepa Prabhu,
	Jiang Liu, Marc Zyngier, Arnd Bergmann, linux-arm-kernel,
	linux-kernel
  Cc: Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

Optimize jump label implementation for ARM64 by dynamically patching
kernel text.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Jiang Liu <liuj97@gmail.com>
---
 arch/arm64/Kconfig                  |  1 +
 arch/arm64/include/asm/jump_label.h | 52 +++++++++++++++++++++++++++++++++
 arch/arm64/kernel/Makefile          |  1 +
 arch/arm64/kernel/jump_label.c      | 58 +++++++++++++++++++++++++++++++++++++
 4 files changed, 112 insertions(+)
 create mode 100644 arch/arm64/include/asm/jump_label.h
 create mode 100644 arch/arm64/kernel/jump_label.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index c044548..da388e4 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -17,6 +17,7 @@ config ARM64
 	select GENERIC_SMP_IDLE_THREAD
 	select GENERIC_TIME_VSYSCALL
 	select HARDIRQS_SW_RESEND
+	select HAVE_ARCH_JUMP_LABEL
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_DEBUG_BUGVERBOSE
 	select HAVE_DEBUG_KMEMLEAK
diff --git a/arch/arm64/include/asm/jump_label.h b/arch/arm64/include/asm/jump_label.h
new file mode 100644
index 0000000..d268fab
--- /dev/null
+++ b/arch/arm64/include/asm/jump_label.h
@@ -0,0 +1,52 @@
+/*
+ * Copyright (C) 2013 Huawei Ltd.
+ * Author: Jiang Liu <jiang.liu@huawei.com>
+ *
+ * Based on arch/arm/include/asm/jump_label.h
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef _ASM_ARM64_JUMP_LABEL_H
+#define _ASM_ARM64_JUMP_LABEL_H
+#include <linux/types.h>
+
+#ifdef __KERNEL__
+
+#define JUMP_LABEL_NOP_SIZE 4
+
+static __always_inline bool arch_static_branch(struct static_key *key)
+{
+	asm goto("1:\n\t"
+		 "nop\n\t"
+		 ".pushsection __jump_table,  \"aw\"\n\t"
+		 ".align 3\n\t"
+		 ".quad 1b, %l[l_yes], %c0\n\t"
+		 ".popsection\n\t"
+		 :  :  "i"(key) :  : l_yes);
+
+	return false;
+l_yes:
+	return true;
+}
+
+#endif /* __KERNEL__ */
+
+typedef u64 jump_label_t;
+
+struct jump_entry {
+	jump_label_t code;
+	jump_label_t target;
+	jump_label_t key;
+};
+
+#endif	/* _ASM_ARM64_JUMP_LABEL_H */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 9af6cb3..b7db65e 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -18,6 +18,7 @@ arm64-obj-$(CONFIG_SMP)			+= smp.o smp_spin_table.o smp_psci.o
 arm64-obj-$(CONFIG_HW_PERF_EVENTS)	+= perf_event.o
 arm64-obj-$(CONFIG_HAVE_HW_BREAKPOINT)+= hw_breakpoint.o
 arm64-obj-$(CONFIG_EARLY_PRINTK)	+= early_printk.o
+arm64-obj-$(CONFIG_JUMP_LABEL)		+= jump_label.o
 
 obj-y					+= $(arm64-obj-y) vdso/
 obj-m					+= $(arm64-obj-m)
diff --git a/arch/arm64/kernel/jump_label.c b/arch/arm64/kernel/jump_label.c
new file mode 100644
index 0000000..871786a
--- /dev/null
+++ b/arch/arm64/kernel/jump_label.c
@@ -0,0 +1,58 @@
+/*
+ * Copyright (C) 2013 Huawei Ltd.
+ * Author: Jiang Liu <jiang.liu@huawei.com>
+ *
+ * Based on arch/arm/kernel/jump_label.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#include <linux/kernel.h>
+#include <linux/jump_label.h>
+#include <asm/jump_label.h>
+#include <asm/insn.h>
+
+#ifdef HAVE_JUMP_LABEL
+
+static void __arch_jump_label_transform(struct jump_entry *entry,
+					enum jump_label_type type,
+					bool is_static)
+{
+	void *addr = (void *)entry->code;
+	u32 insn;
+
+	if (type == JUMP_LABEL_ENABLE) {
+		insn = aarch64_insn_gen_branch_imm(entry->code,
+						   entry->target, 0);
+	} else {
+		insn = aarch64_insn_gen_nop();
+	}
+
+	if (is_static)
+		aarch64_insn_patch_text_nosync(&addr, &insn, 1);
+	else
+		aarch64_insn_patch_text(&addr, &insn, 1);
+}
+
+void arch_jump_label_transform(struct jump_entry *entry,
+			       enum jump_label_type type)
+{
+	__arch_jump_label_transform(entry, type, false);
+}
+
+void arch_jump_label_transform_static(struct jump_entry *entry,
+				      enum jump_label_type type)
+{
+	__arch_jump_label_transform(entry, type, true);
+}
+
+#endif	/* HAVE_JUMP_LABEL */
-- 
1.8.1.2


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v5 7/7] jump_label: use defined macros instead of hard-coding for better readability
  2013-10-18 15:19 ` Jiang Liu
@ 2013-10-18 15:20   ` Jiang Liu
  -1 siblings, 0 replies; 54+ messages in thread
From: Jiang Liu @ 2013-10-18 15:20 UTC (permalink / raw)
  To: Steven Rostedt, Catalin Marinas, Will Deacon, Sandeepa Prabhu,
	Jiang Liu, Andrew Jones, Raghavendra K T, Konrad Rzeszutek Wilk,
	H. Peter Anvin, linux-kernel
  Cc: linux-arm-kernel, Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

Use macro JUMP_LABEL_TRUE_BRANCH instead of hard-coding for better
readability.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Jiang Liu <liuj97@gmail.com>
---
 include/linux/jump_label.h | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index a507907..6e54029 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -74,18 +74,21 @@ struct module;
 #include <linux/atomic.h>
 #ifdef HAVE_JUMP_LABEL
 
-#define JUMP_LABEL_TRUE_BRANCH 1UL
+#define JUMP_LABEL_TYPE_FALSE_BRANCH	0UL
+#define JUMP_LABEL_TYPE_TRUE_BRANCH	1UL
+#define JUMP_LABEL_TYPE_MASK		1UL
 
 static
 inline struct jump_entry *jump_label_get_entries(struct static_key *key)
 {
 	return (struct jump_entry *)((unsigned long)key->entries
-						& ~JUMP_LABEL_TRUE_BRANCH);
+						& ~JUMP_LABEL_TYPE_MASK);
 }
 
 static inline bool jump_label_get_branch_default(struct static_key *key)
 {
-	if ((unsigned long)key->entries & JUMP_LABEL_TRUE_BRANCH)
+	if (((unsigned long)key->entries & JUMP_LABEL_TYPE_MASK) ==
+	    JUMP_LABEL_TYPE_TRUE_BRANCH)
 		return true;
 	return false;
 }
@@ -116,9 +119,11 @@ extern void static_key_slow_dec(struct static_key *key);
 extern void jump_label_apply_nops(struct module *mod);
 
 #define STATIC_KEY_INIT_TRUE ((struct static_key) \
-	{ .enabled = ATOMIC_INIT(1), .entries = (void *)1 })
+	{ .enabled = ATOMIC_INIT(1), \
+	  .entries = (void *)JUMP_LABEL_TYPE_TRUE_BRANCH })
 #define STATIC_KEY_INIT_FALSE ((struct static_key) \
-	{ .enabled = ATOMIC_INIT(0), .entries = (void *)0 })
+	{ .enabled = ATOMIC_INIT(0), \
+	  .entries = (void *)JUMP_LABEL_TYPE_FALSE_BRANCH })
 
 #else  /* !HAVE_JUMP_LABEL */
 
-- 
1.8.1.2


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH v5 2/7] arm64: introduce interfaces to hotpatch kernel and module code
  2013-10-18 15:19   ` Jiang Liu
@ 2013-10-30  0:12     ` Will Deacon
  -1 siblings, 0 replies; 54+ messages in thread
From: Will Deacon @ 2013-10-30  0:12 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Steven Rostedt, Catalin Marinas, Sandeepa Prabhu, Jiang Liu,
	linux-arm-kernel, linux-kernel

Hi Jiang Liu,

Sorry for the delayed review, I've been travelling.

On Fri, Oct 18, 2013 at 04:19:56PM +0100, Jiang Liu wrote:
> From: Jiang Liu <jiang.liu@huawei.com>

If I try and email you at your Huawei address, I get a bounce from the mail
server. Is that expected? If so, it's not very helpful from a commit log
perspective if you use that address as the author on your patches.

> Introduce three interfaces to patch kernel and module code:
> aarch64_insn_patch_text_nosync():
> 	patch code without synchronization, it's caller's responsibility
> 	to synchronize all CPUs if needed.
> aarch64_insn_patch_text_sync():
> 	patch code and always synchronize with stop_machine()
> aarch64_insn_patch_text():
> 	patch code and synchronize with stop_machine() if needed
> 
> Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
> Cc: Jiang Liu <liuj97@gmail.com>
> ---
>  arch/arm64/include/asm/insn.h | 19 ++++++++-
>  arch/arm64/kernel/insn.c      | 91 +++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 109 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
> index 7499490..7a69491 100644
> --- a/arch/arm64/include/asm/insn.h
> +++ b/arch/arm64/include/asm/insn.h
> @@ -71,8 +71,25 @@ enum aarch64_insn_hint_op {
>  
>  bool aarch64_insn_is_nop(u32 insn);
>  
> -enum aarch64_insn_encoding_class aarch64_get_insn_class(u32 insn);
> +/*
> + * In ARMv8-A, A64 instructions have a fixed length of 32 bits and are always
> + * little-endian.
> + */
> +static __always_inline u32 aarch64_insn_read(void *addr)
> +{
> +	return le32_to_cpu(*(u32 *)addr);
> +}
>  
> +static __always_inline void aarch64_insn_write(void *addr, u32 insn)
> +{
> +	*(u32 *)addr = cpu_to_le32(insn);
> +}

I wouldn't bother with these helpers. You should probably be using
probe_kernel_address or similar, then doing the endianness swabbing on the
return value in-line.
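
Something like this at the call site, perhaps (completely untested, just to
illustrate the idea; addr is the instruction address):

	u32 le_insn, insn;

	if (probe_kernel_address((u32 *)addr, le_insn))
		return -EFAULT;
	insn = le32_to_cpu(le_insn);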

> +enum aarch64_insn_encoding_class aarch64_get_insn_class(u32 insn);
>  bool aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn);
>  
> +int aarch64_insn_patch_text_nosync(void *addr, u32 insn);
> +int aarch64_insn_patch_text_sync(void *addrs[], u32 insns[], int cnt);
> +int aarch64_insn_patch_text(void *addrs[], u32 insns[], int cnt);
> +
>  #endif	/* __ASM_INSN_H */
> diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
> index f5b779f..3879db4 100644
> --- a/arch/arm64/kernel/insn.c
> +++ b/arch/arm64/kernel/insn.c
> @@ -16,6 +16,9 @@
>   */
>  #include <linux/compiler.h>
>  #include <linux/kernel.h>
> +#include <linux/smp.h>
> +#include <linux/stop_machine.h>
> +#include <asm/cacheflush.h>
>  #include <asm/insn.h>
>  
>  static int aarch64_insn_encoding_class[] = {
> @@ -88,3 +91,91 @@ bool __kprobes aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn)
>  	return __aarch64_insn_hotpatch_safe(old_insn) &&
>  	       __aarch64_insn_hotpatch_safe(new_insn);
>  }
> +
> +int __kprobes aarch64_insn_patch_text_nosync(void *addr, u32 insn)
> +{
> +	u32 *tp = addr;
> +
> +	/* A64 instructions must be word aligned */
> +	if ((uintptr_t)tp & 0x3)
> +		return -EINVAL;
> +
> +	aarch64_insn_write(tp, insn);
> +	flush_icache_range((uintptr_t)tp, (uintptr_t)tp + sizeof(u32));

sizeof(insn) is clearer here.

> +
> +	return 0;
> +}
> +
> +struct aarch64_insn_patch {
> +	void	**text_addrs;
> +	u32	*new_insns;
> +	int	insn_cnt;
> +};
> +
> +static DEFINE_MUTEX(text_patch_lock);
> +static atomic_t text_patch_id;
> +
> +static int __kprobes aarch64_insn_patch_text_cb(void *arg)
> +{
> +	int i, ret = 0;
> +	struct aarch64_insn_patch *pp = arg;
> +
> +	if (atomic_read(&text_patch_id) == smp_processor_id()) {

You could actually pass in the text_patch_id as a zero-initialised parameter
to this function (i.e. it points to something on the stack of the guy
issuing the stop_machine). Then you do an inc_return here. If you see zero,
you go ahead and do the patching.
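
i.e. roughly (untested sketch; cpu_count is a new zero-initialised atomic_t
that I'm assuming gets added to struct aarch64_insn_patch by the caller):

	if (atomic_inc_return(&pp->cpu_count) == 1) {
		/* First CPU to get here does the patching... */
		for (i = 0; ret == 0 && i < pp->insn_cnt; i++)
			ret = aarch64_insn_patch_text_nosync(pp->text_addrs[i],
							     pp->new_insns[i]);
		/* ...and then releases the waiters as before. */
	}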

> +		for (i = 0; ret == 0 && i < pp->insn_cnt; i++)
> +			ret = aarch64_insn_patch_text_nosync(pp->text_addrs[i],
> +							     pp->new_insns[i]);
> +		/* Let other CPU continue */
> +		atomic_set(&text_patch_id, -1);

You're relying on the barrier in flush_icache_range to order this
atomic_set. I think you should add a comment describing that, because it's
very subtle.

> +	} else {
> +		while (atomic_read(&text_patch_id) != -1)
> +			cpu_relax();
> +		isb();
> +	}
> +
> +	return ret;
> +}

I don't think you need the isb -- the exception return should do the trick
(but again, a comment would be useful).

> +
> +int __kprobes aarch64_insn_patch_text_sync(void *addrs[], u32 insns[], int cnt)
> +{
> +	int ret;
> +	struct aarch64_insn_patch patch = {
> +		.text_addrs = addrs,
> +		.new_insns = insns,
> +		.insn_cnt = cnt,
> +	};
> +
> +	if (cnt <= 0)
> +		return -EINVAL;
> +
> +	mutex_lock(&text_patch_lock);

Again, if you use a local variable, I don't think you need the mutex. What
do you think?

> +	atomic_set(&text_patch_id, smp_processor_id());
> +	ret = stop_machine(aarch64_insn_patch_text_cb, &patch, cpu_online_mask);

Instead of doing this, why not instead call aarch64_insn_patch_text_nosync
inline, then call kick_all_cpus_sync immediately afterwards, without the
need to stop_machine.
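
i.e. something like this in aarch64_insn_patch_text() (untested sketch):

	int i, ret = 0;

	for (i = 0; ret == 0 && i < cnt; i++)
		ret = aarch64_insn_patch_text_nosync(addrs[i], insns[i]);
	kick_all_cpus_sync();

	return ret;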

Will

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v5 4/7] arm64: introduce aarch64_insn_gen_{nop|branch_imm}() helper functions
  2013-10-18 15:19   ` Jiang Liu
@ 2013-10-30  0:48     ` Will Deacon
  -1 siblings, 0 replies; 54+ messages in thread
From: Will Deacon @ 2013-10-30  0:48 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Steven Rostedt, Catalin Marinas, Sandeepa Prabhu, Jiang Liu,
	linux-arm-kernel, linux-kernel

On Fri, Oct 18, 2013 at 04:19:58PM +0100, Jiang Liu wrote:
> From: Jiang Liu <jiang.liu@huawei.com>
> 
> Introduce aarch64_insn_gen_{nop|branch_imm}() helper functions, which
> will be used to implement jump label on ARM64.
> 
> Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
> Cc: Jiang Liu <liuj97@gmail.com>
> ---
>  arch/arm64/include/asm/insn.h | 13 +++++++++++++
>  arch/arm64/kernel/insn.c      | 28 ++++++++++++++++++++++++++++
>  2 files changed, 41 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
> index 8f94e48..ee4a60d 100644
> --- a/arch/arm64/include/asm/insn.h
> +++ b/arch/arm64/include/asm/insn.h
> @@ -96,9 +96,22 @@ static __always_inline void aarch64_insn_write(void *addr, u32 insn)
>  	*(u32 *)addr = cpu_to_le32(insn);
>  }
>  
> +static __always_inline u32 aarch64_insn_gen_hint(enum aarch64_insn_hint_op op)
> +{
> +	return aarch64_insn_get_hint_value() | op;

What's the point in that helper function? Does it just return an immediate
value (and are there other users of it?).

> +}
> +
> +static __always_inline u32 aarch64_insn_gen_nop(void)
> +{
> +	return aarch64_insn_gen_hint(AARCH64_INSN_HINT_NOP);
> +}

Either use plain old `inline' or write these as preprocessor macros.
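
e.g. the macro form could look roughly like this (just a sketch):

#define aarch64_insn_gen_hint(op)	(aarch64_insn_get_hint_value() | (op))
#define aarch64_insn_gen_nop()		aarch64_insn_gen_hint(AARCH64_INSN_HINT_NOP)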

> +
>  enum aarch64_insn_encoding_class aarch64_get_insn_class(u32 insn);
>  u32 aarch64_insn_encode_immediate(enum aarch64_insn_imm_type type,
>  				  u32 insn, u64 imm);
> +u32 aarch64_insn_gen_branch_imm(unsigned long pc, unsigned long addr,
> +				bool link);
> +
>  bool aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn);
>  
>  int aarch64_insn_patch_text_nosync(void *addr, u32 insn);
> diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
> index 886821f..f7498cc 100644
> --- a/arch/arm64/kernel/insn.c
> +++ b/arch/arm64/kernel/insn.c
> @@ -14,6 +14,7 @@
>   * You should have received a copy of the GNU General Public License
>   * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>   */
> +#include <linux/bitops.h>
>  #include <linux/compiler.h>
>  #include <linux/kernel.h>
>  #include <linux/smp.h>
> @@ -233,3 +234,30 @@ u32 aarch64_insn_encode_immediate(enum aarch64_insn_imm_type type,
>  
>  	return insn;
>  }
> +
> +u32 aarch64_insn_gen_branch_imm(unsigned long pc, unsigned long addr, bool link)
> +{
> +	u32 insn;
> +	long offset;
> +
> +	/*
> +	 * PC: A 64-bit Program Counter holding the address of the current
> +	 * instruction. A64 instructions may be word-aligned.
> +	 */
> +	BUG_ON((pc & 0x3) || (addr & 0x3));
> +
> +	/*
> +	 * B/BL support [-128M, 128M) offset
> +	 * ARM64 virtual address arrangement garantees all kernel and module

Typo: `guarantees'.

> +	 * texts are within +/-128M.
> +	 */
> +	offset = ((long)addr - (long)pc) >> 2;
> +	BUG_ON(abs(offset) > BIT(25) || offset == BIT(25));

I really struggle to follow this range checking. Why don't you just compare
the absolute difference between addr and pc with SZ_128M?
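
i.e. something along these lines (untested; SZ_128M comes from <linux/sizes.h>):

	long offset = (long)addr - (long)pc;

	BUG_ON(offset < -SZ_128M || offset >= SZ_128M);
	offset >>= 2;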

Will

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v5 5/7] arm64, jump label: detect %c support for ARM64
  2013-10-18 15:19   ` Jiang Liu
@ 2013-10-30  0:49     ` Will Deacon
  -1 siblings, 0 replies; 54+ messages in thread
From: Will Deacon @ 2013-10-30  0:49 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Steven Rostedt, Catalin Marinas, Sandeepa Prabhu, Jiang Liu,
	linux-kernel, linux-arm-kernel

On Fri, Oct 18, 2013 at 04:19:59PM +0100, Jiang Liu wrote:
> From: Jiang Liu <jiang.liu@huawei.com>
> 
> As commit a9468f30b5eac6 "ARM: 7333/2: jump label: detect %c
> support for ARM", this patch detects the same thing for ARM64
> because some ARM64 GCC versions have the same issue.
> 
> Some versions of ARM64 GCC which do support asm goto, do not
> support the %c specifier. Since we need the %c to support jump
> labels on ARM64, detect that too in the asm goto detection script
> to avoid build errors with these versions.

Acked-by: Will Deacon <will.deacon@arm.com>

The toolchain is being fixed upstream for this, but we should have the check
for older versions.

Will

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v5 6/7] arm64, jump label: optimize jump label implementation
  2013-10-18 15:20   ` Jiang Liu
@ 2013-10-30  0:55     ` Will Deacon
  -1 siblings, 0 replies; 54+ messages in thread
From: Will Deacon @ 2013-10-30  0:55 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Steven Rostedt, Catalin Marinas, Sandeepa Prabhu, Jiang Liu,
	Marc Zyngier, Arnd Bergmann, linux-arm-kernel, linux-kernel

On Fri, Oct 18, 2013 at 04:20:00PM +0100, Jiang Liu wrote:
> From: Jiang Liu <jiang.liu@huawei.com>
> 
> Optimize jump label implementation for ARM64 by dynamically patching
> kernel text.
> 
> Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
> Cc: Jiang Liu <liuj97@gmail.com>
> ---
>  arch/arm64/Kconfig                  |  1 +
>  arch/arm64/include/asm/jump_label.h | 52 +++++++++++++++++++++++++++++++++
>  arch/arm64/kernel/Makefile          |  1 +
>  arch/arm64/kernel/jump_label.c      | 58 +++++++++++++++++++++++++++++++++++++
>  4 files changed, 112 insertions(+)
>  create mode 100644 arch/arm64/include/asm/jump_label.h
>  create mode 100644 arch/arm64/kernel/jump_label.c
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index c044548..da388e4 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -17,6 +17,7 @@ config ARM64
>  	select GENERIC_SMP_IDLE_THREAD
>  	select GENERIC_TIME_VSYSCALL
>  	select HARDIRQS_SW_RESEND
> +	select HAVE_ARCH_JUMP_LABEL
>  	select HAVE_ARCH_TRACEHOOK
>  	select HAVE_DEBUG_BUGVERBOSE
>  	select HAVE_DEBUG_KMEMLEAK
> diff --git a/arch/arm64/include/asm/jump_label.h b/arch/arm64/include/asm/jump_label.h
> new file mode 100644
> index 0000000..d268fab
> --- /dev/null
> +++ b/arch/arm64/include/asm/jump_label.h
> @@ -0,0 +1,52 @@
> +/*
> + * Copyright (C) 2013 Huawei Ltd.
> + * Author: Jiang Liu <jiang.liu@huawei.com>
> + *
> + * Based on arch/arm/include/asm/jump_label.h
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +#ifndef _ASM_ARM64_JUMP_LABEL_H
> +#define _ASM_ARM64_JUMP_LABEL_H
> +#include <linux/types.h>
> +
> +#ifdef __KERNEL__
> +
> +#define JUMP_LABEL_NOP_SIZE 4
> +
> +static __always_inline bool arch_static_branch(struct static_key *key)
> +{
> +	asm goto("1:\n\t"
> +		 "nop\n\t"

Minor nit: can you put the label on the same line as the nop please?

> +		 ".pushsection __jump_table,  \"aw\"\n\t"
> +		 ".align 3\n\t"
> +		 ".quad 1b, %l[l_yes], %c0\n\t"
> +		 ".popsection\n\t"
> +		 :  :  "i"(key) :  : l_yes);
> +
> +	return false;
> +l_yes:
> +	return true;
> +}
> +
> +#endif /* __KERNEL__ */
> +
> +typedef u64 jump_label_t;
> +
> +struct jump_entry {
> +	jump_label_t code;
> +	jump_label_t target;
> +	jump_label_t key;
> +};
> +
> +#endif	/* _ASM_ARM64_JUMP_LABEL_H */
> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
> index 9af6cb3..b7db65e 100644
> --- a/arch/arm64/kernel/Makefile
> +++ b/arch/arm64/kernel/Makefile
> @@ -18,6 +18,7 @@ arm64-obj-$(CONFIG_SMP)			+= smp.o smp_spin_table.o smp_psci.o
>  arm64-obj-$(CONFIG_HW_PERF_EVENTS)	+= perf_event.o
>  arm64-obj-$(CONFIG_HAVE_HW_BREAKPOINT)+= hw_breakpoint.o
>  arm64-obj-$(CONFIG_EARLY_PRINTK)	+= early_printk.o
> +arm64-obj-$(CONFIG_JUMP_LABEL)		+= jump_label.o
>  
>  obj-y					+= $(arm64-obj-y) vdso/
>  obj-m					+= $(arm64-obj-m)
> diff --git a/arch/arm64/kernel/jump_label.c b/arch/arm64/kernel/jump_label.c
> new file mode 100644
> index 0000000..871786a
> --- /dev/null
> +++ b/arch/arm64/kernel/jump_label.c
> @@ -0,0 +1,58 @@
> +/*
> + * Copyright (C) 2013 Huawei Ltd.
> + * Author: Jiang Liu <jiang.liu@huawei.com>
> + *
> + * Based on arch/arm/kernel/jump_label.c
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +#include <linux/kernel.h>
> +#include <linux/jump_label.h>
> +#include <asm/jump_label.h>
> +#include <asm/insn.h>
> +
> +#ifdef HAVE_JUMP_LABEL
> +
> +static void __arch_jump_label_transform(struct jump_entry *entry,
> +					enum jump_label_type type,
> +					bool is_static)
> +{
> +	void *addr = (void *)entry->code;
> +	u32 insn;
> +
> +	if (type == JUMP_LABEL_ENABLE) {
> +		insn = aarch64_insn_gen_branch_imm(entry->code,
> +						   entry->target, 0);

Maybe have an enum or #define for the last parameter -- it's not at all
obvious that '0' means `no link'.
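
e.g. something like this (the names are purely illustrative):

	enum aarch64_insn_branch_type {
		AARCH64_INSN_BRANCH_NOLINK,
		AARCH64_INSN_BRANCH_LINK,
	};

so that the call site reads aarch64_insn_gen_branch_imm(entry->code,
entry->target, AARCH64_INSN_BRANCH_NOLINK).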

> +	} else {
> +		insn = aarch64_insn_gen_nop();
> +	}
> +
> +	if (is_static)
> +		aarch64_insn_patch_text_nosync(&addr, &insn, 1);

I think I mentioned this before, but is it ever safe to patch more than one
instruction in the _nosync case? If not, I wouldn't even bother with that
last argument.

Will

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v5 1/7] arm64: introduce basic aarch64 instruction decoding helpers
  2013-10-18 15:19   ` Jiang Liu
@ 2013-10-31 17:39     ` Catalin Marinas
  -1 siblings, 0 replies; 54+ messages in thread
From: Catalin Marinas @ 2013-10-31 17:39 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Steven Rostedt, Will Deacon, Sandeepa Prabhu, Jiang Liu,
	Marc Zyngier, Arnd Bergmann, linux-arm-kernel, linux-kernel

On Fri, Oct 18, 2013 at 04:19:55PM +0100, Jiang Liu wrote:
> +/*
> + * ARMv8-A Section B2.6.5:

Minor comment - please specify the document revision and the title of
the section, as they keep changing in subsequent ARMv8 ARM updates and
this comment would otherwise no longer match.

-- 
Catalin

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v5 2/7] arm64: introduce interfaces to hotpatch kernel and module code
  2013-10-30  0:12     ` Will Deacon
@ 2013-11-03 15:55       ` Jiang Liu
  -1 siblings, 0 replies; 54+ messages in thread
From: Jiang Liu @ 2013-11-03 15:55 UTC (permalink / raw)
  To: Will Deacon
  Cc: Steven Rostedt, Catalin Marinas, Sandeepa Prabhu, Jiang Liu,
	linux-arm-kernel, linux-kernel

On 10/30/2013 08:12 AM, Will Deacon wrote:
> Hi Jiang Liu,
> 
> Sorry for the delayed review, I've been travelling.
> 
> On Fri, Oct 18, 2013 at 04:19:56PM +0100, Jiang Liu wrote:
>> From: Jiang Liu <jiang.liu@huawei.com>
> 
> If I try and email you at your Huawei address, I get a bounce from the mail
> server. Is that expected? If so, it's not very helpful from a commit log
> perspective if you use that address as the author on your patches.
> 
Hi Will,
	Sorry for the inconvenience. I left Huawei recently and have been
on vacation for the last two weeks. I will use my gmail account next
time.

>> Introduce three interfaces to patch kernel and module code:
>> aarch64_insn_patch_text_nosync():
>> 	patch code without synchronization, it's caller's responsibility
>> 	to synchronize all CPUs if needed.
>> aarch64_insn_patch_text_sync():
>> 	patch code and always synchronize with stop_machine()
>> aarch64_insn_patch_text():
>> 	patch code and synchronize with stop_machine() if needed
>>
>> Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
>> Cc: Jiang Liu <liuj97@gmail.com>
>> ---
>>  arch/arm64/include/asm/insn.h | 19 ++++++++-
>>  arch/arm64/kernel/insn.c      | 91 +++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 109 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
>> index 7499490..7a69491 100644
>> --- a/arch/arm64/include/asm/insn.h
>> +++ b/arch/arm64/include/asm/insn.h
>> @@ -71,8 +71,25 @@ enum aarch64_insn_hint_op {
>>  
>>  bool aarch64_insn_is_nop(u32 insn);
>>  
>> -enum aarch64_insn_encoding_class aarch64_get_insn_class(u32 insn);
>> +/*
>> + * In ARMv8-A, A64 instructions have a fixed length of 32 bits and are always
>> + * little-endian.
>> + */
>> +static __always_inline u32 aarch64_insn_read(void *addr)
>> +{
>> +	return le32_to_cpu(*(u32 *)addr);
>> +}
>>  
>> +static __always_inline void aarch64_insn_write(void *addr, u32 insn)
>> +{
>> +	*(u32 *)addr = cpu_to_le32(insn);
>> +}
> 
> I wouldn't bother with these helpers. You should probably be using
> probe_kernel_address or similar, then doing the endianness swabbing on the
> return value in-line.
How about keeping and refining the aarch64_insn_read/write interfaces
by using probe_kernel_address()? I think they may be used in other
places when supporting a big-endian ARM64 kernel.
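
Something like this, perhaps (untested sketch using probe_kernel_read(),
which probe_kernel_address() wraps, and changing the signature so faults
can be reported back to the caller):

	int aarch64_insn_read(void *addr, u32 *insnp)
	{
		int ret;
		u32 val;

		ret = probe_kernel_read(&val, addr, sizeof(val));
		if (!ret)
			*insnp = le32_to_cpu(val);

		return ret;
	}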

> 
>> +enum aarch64_insn_encoding_class aarch64_get_insn_class(u32 insn);
>>  bool aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn);
>>  
>> +int aarch64_insn_patch_text_nosync(void *addr, u32 insn);
>> +int aarch64_insn_patch_text_sync(void *addrs[], u32 insns[], int cnt);
>> +int aarch64_insn_patch_text(void *addrs[], u32 insns[], int cnt);
>> +
>>  #endif	/* __ASM_INSN_H */
>> diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
>> index f5b779f..3879db4 100644
>> --- a/arch/arm64/kernel/insn.c
>> +++ b/arch/arm64/kernel/insn.c
>> @@ -16,6 +16,9 @@
>>   */
>>  #include <linux/compiler.h>
>>  #include <linux/kernel.h>
>> +#include <linux/smp.h>
>> +#include <linux/stop_machine.h>
>> +#include <asm/cacheflush.h>
>>  #include <asm/insn.h>
>>  
>>  static int aarch64_insn_encoding_class[] = {
>> @@ -88,3 +91,91 @@ bool __kprobes aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn)
>>  	return __aarch64_insn_hotpatch_safe(old_insn) &&
>>  	       __aarch64_insn_hotpatch_safe(new_insn);
>>  }
>> +
>> +int __kprobes aarch64_insn_patch_text_nosync(void *addr, u32 insn)
>> +{
>> +	u32 *tp = addr;
>> +
>> +	/* A64 instructions must be word aligned */
>> +	if ((uintptr_t)tp & 0x3)
>> +		return -EINVAL;
>> +
>> +	aarch64_insn_write(tp, insn);
>> +	flush_icache_range((uintptr_t)tp, (uintptr_t)tp + sizeof(u32));
> 
> sizeof(insn) is clearer here.
> 
Will make this change in next version.

>> +
>> +	return 0;
>> +}
>> +
>> +struct aarch64_insn_patch {
>> +	void	**text_addrs;
>> +	u32	*new_insns;
>> +	int	insn_cnt;
>> +};
>> +
>> +static DEFINE_MUTEX(text_patch_lock);
>> +static atomic_t text_patch_id;
>> +
>> +static int __kprobes aarch64_insn_patch_text_cb(void *arg)
>> +{
>> +	int i, ret = 0;
>> +	struct aarch64_insn_patch *pp = arg;
>> +
>> +	if (atomic_read(&text_patch_id) == smp_processor_id()) {
> 
> You could actually pass in the test_patch_id as zero-initialised parameter
> to this function (i.e. it points to something on the stack of the guy
> issuing the stop_machine). Then you do an inc_return here. If you see zero,
> you go ahead and do the patching.
Good suggestion!
Function stop_machine() already has a mutex to serialize all callers,
so we don't need explicit serialization here. Will simplify the
code in the next version.

> 
>> +		for (i = 0; ret == 0 && i < pp->insn_cnt; i++)
>> +			ret = aarch64_insn_patch_text_nosync(pp->text_addrs[i],
>> +							     pp->new_insns[i]);
>> +		/* Let other CPU continue */
>> +		atomic_set(&text_patch_id, -1);
> 
> You're relying on the barrier in flush_icache_range to order this
> atomic_set. I think you should add a comment describing that, because it's
> very subtle.
How about an explicit smp_wmb() here? That would be more
maintainable.
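
I.e. something like this (sketch only):

		for (i = 0; ret == 0 && i < pp->insn_cnt; i++)
			ret = aarch64_insn_patch_text_nosync(pp->text_addrs[i],
							     pp->new_insns[i]);
		/* make the patched text visible before releasing the spinners */
		smp_wmb();
		/* Let other CPUs continue */
		atomic_set(&text_patch_id, -1);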

> 
>> +	} else {
>> +		while (atomic_read(&text_patch_id) != -1)
>> +			cpu_relax();
>> +		isb();
>> +	}
>> +
>> +	return ret;
>> +}
> 
> I don't think you need the isb -- the exception return should do the trick
> (but again, a comment would be useful).
stop_machine() is implemented by thread scheduling instead of a cross-CPU
function call (IPI), so there may be no "eret" after returning from
this callback function. So I used an "isb" here.
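
So the waiting path would look roughly like this, with the comment added
(sketch):

	} else {
		while (atomic_read(&text_patch_id) != -1)
			cpu_relax();
		/*
		 * stop_machine() runs this callback in a scheduled thread
		 * context, so there may be no exception return before the
		 * patched text is hit; the isb makes this CPU refetch the
		 * new instruction stream.
		 */
		isb();
	}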

> 
>> +
>> +int __kprobes aarch64_insn_patch_text_sync(void *addrs[], u32 insns[], int cnt)
>> +{
>> +	int ret;
>> +	struct aarch64_insn_patch patch = {
>> +		.text_addrs = addrs,
>> +		.new_insns = insns,
>> +		.insn_cnt = cnt,
>> +	};
>> +
>> +	if (cnt <= 0)
>> +		return -EINVAL;
>> +
>> +	mutex_lock(&text_patch_lock);
> 
> Again, if you use a loacl variable, I don't think you need the mutex. What
> do you think?
Sure, will make the change.

> 
>> +	atomic_set(&text_patch_id, smp_processor_id());
>> +	ret = stop_machine(aarch64_insn_patch_text_cb, &patch, cpu_online_mask);
> 
> Instead of doing this, why not instead call aarch64_insn_patch_text_nosync
> inline, then call kick_all_cpus_sync immediately afterwards, without the
> need to stop_machine.
Sandeepa, who is working on kprobes for ARM64, needs the stop_machine()
mechanism to synchronize all online CPUs, so it's a preparation for
kprobes.

Thanks!
Gerry
> 
> Will
> 


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v5 2/7] arm64: introduce interfaces to hotpatch kernel and module code
  2013-11-03 15:55       ` Jiang Liu
@ 2013-11-04 15:12         ` Sandeepa Prabhu
  -1 siblings, 0 replies; 54+ messages in thread
From: Sandeepa Prabhu @ 2013-11-04 15:12 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Will Deacon, Steven Rostedt, Catalin Marinas, Jiang Liu,
	linux-arm-kernel, linux-kernel

On 3 November 2013 23:55, Jiang Liu <liuj97@gmail.com> wrote:
> On 10/30/2013 08:12 AM, Will Deacon wrote:
>> Hi Jinag Liu,
>>
>> Sorry for the delayed review, I've been travelling.
>>
>> On Fri, Oct 18, 2013 at 04:19:56PM +0100, Jiang Liu wrote:
>>> From: Jiang Liu <jiang.liu@huawei.com>
>>
>> If I try and email you at your Huawei address, I get a bounce from the mail
>> server. Is that expected? If so, it's not very helpful from a commit log
>> perspective if you use that address as the author on your patches.
>>
> Hi Will,
>         Sorry for the inconvenience. I have left Huawei recently and
> have had a vacation last two weeks. I will use my gmail account next
> time.
>
>>> Introduce three interfaces to patch kernel and module code:
>>> aarch64_insn_patch_text_nosync():
>>>      patch code without synchronization, it's caller's responsibility
>>>      to synchronize all CPUs if needed.
>>> aarch64_insn_patch_text_sync():
>>>      patch code and always synchronize with stop_machine()
>>> aarch64_insn_patch_text():
>>>      patch code and synchronize with stop_machine() if needed
>>>
>>> Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
>>> Cc: Jiang Liu <liuj97@gmail.com>
>>> ---
>>>  arch/arm64/include/asm/insn.h | 19 ++++++++-
>>>  arch/arm64/kernel/insn.c      | 91 +++++++++++++++++++++++++++++++++++++++++++
>>>  2 files changed, 109 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
>>> index 7499490..7a69491 100644
>>> --- a/arch/arm64/include/asm/insn.h
>>> +++ b/arch/arm64/include/asm/insn.h
>>> @@ -71,8 +71,25 @@ enum aarch64_insn_hint_op {
>>>
>>>  bool aarch64_insn_is_nop(u32 insn);
>>>
>>> -enum aarch64_insn_encoding_class aarch64_get_insn_class(u32 insn);
>>> +/*
>>> + * In ARMv8-A, A64 instructions have a fixed length of 32 bits and are always
>>> + * little-endian.
>>> + */
>>> +static __always_inline u32 aarch64_insn_read(void *addr)
>>> +{
>>> +    return le32_to_cpu(*(u32 *)addr);
>>> +}
>>>
>>> +static __always_inline void aarch64_insn_write(void *addr, u32 insn)
>>> +{
>>> +    *(u32 *)addr = cpu_to_le32(insn);
>>> +}
>>
>> I wouldn't bother with these helpers. You should probably be using
>> probe_kernel_address or similar, then doing the endianness swabbing on the
>> return value in-line.
> How about keeping and refining aarch64_insn_read/write interfaces
> by using probe_kernel_address()? I think they may be used in other
> places when supporting big endian ARM64 kernel.
>
>>
>>> +enum aarch64_insn_encoding_class aarch64_get_insn_class(u32 insn);
>>>  bool aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn);
>>>
>>> +int aarch64_insn_patch_text_nosync(void *addr, u32 insn);
>>> +int aarch64_insn_patch_text_sync(void *addrs[], u32 insns[], int cnt);
>>> +int aarch64_insn_patch_text(void *addrs[], u32 insns[], int cnt);
>>> +
>>>  #endif      /* __ASM_INSN_H */
>>> diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
>>> index f5b779f..3879db4 100644
>>> --- a/arch/arm64/kernel/insn.c
>>> +++ b/arch/arm64/kernel/insn.c
>>> @@ -16,6 +16,9 @@
>>>   */
>>>  #include <linux/compiler.h>
>>>  #include <linux/kernel.h>
>>> +#include <linux/smp.h>
>>> +#include <linux/stop_machine.h>
>>> +#include <asm/cacheflush.h>
>>>  #include <asm/insn.h>
>>>
>>>  static int aarch64_insn_encoding_class[] = {
>>> @@ -88,3 +91,91 @@ bool __kprobes aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn)
>>>      return __aarch64_insn_hotpatch_safe(old_insn) &&
>>>             __aarch64_insn_hotpatch_safe(new_insn);
>>>  }
>>> +
>>> +int __kprobes aarch64_insn_patch_text_nosync(void *addr, u32 insn)
>>> +{
>>> +    u32 *tp = addr;
>>> +
>>> +    /* A64 instructions must be word aligned */
>>> +    if ((uintptr_t)tp & 0x3)
>>> +            return -EINVAL;
>>> +
>>> +    aarch64_insn_write(tp, insn);
>>> +    flush_icache_range((uintptr_t)tp, (uintptr_t)tp + sizeof(u32));
>>
>> sizeof(insn) is clearer here.
>>
> Will make this change in next version.
>
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +struct aarch64_insn_patch {
>>> +    void    **text_addrs;
>>> +    u32     *new_insns;
>>> +    int     insn_cnt;
>>> +};
>>> +
>>> +static DEFINE_MUTEX(text_patch_lock);
>>> +static atomic_t text_patch_id;
>>> +
>>> +static int __kprobes aarch64_insn_patch_text_cb(void *arg)
>>> +{
>>> +    int i, ret = 0;
>>> +    struct aarch64_insn_patch *pp = arg;
>>> +
>>> +    if (atomic_read(&text_patch_id) == smp_processor_id()) {
>>
>> You could actually pass in the test_patch_id as zero-initialised parameter
>> to this function (i.e. it points to something on the stack of the guy
>> issuing the stop_machine). Then you do an inc_return here. If you see zero,
>> you go ahead and do the patching.
> Good suggestion!
> Function stop_machine() already has a mutex to serialize all callers,
> so we don't need explicitly serialization here. Will simplify the
> code in next version.
>
>>
>>> +            for (i = 0; ret == 0 && i < pp->insn_cnt; i++)
>>> +                    ret = aarch64_insn_patch_text_nosync(pp->text_addrs[i],
>>> +                                                         pp->new_insns[i]);
>>> +            /* Let other CPU continue */
>>> +            atomic_set(&text_patch_id, -1);
>>
>> You're relying on the barrier in flush_icache_range to order this
>> atomic_set. I think you should add a comment describing that, because it's
>> very subtle.
> How about an explicitly smp_wmb() here? That would be more
> maintainable.
>
>>
>>> +    } else {
>>> +            while (atomic_read(&text_patch_id) != -1)
>>> +                    cpu_relax();
>>> +            isb();
>>> +    }
>>> +
>>> +    return ret;
>>> +}
>>
>> I don't think you need the isb -- the exception return should do the trick
>> (but again, a comment would be useful).
> stop_machine() is implemented by thread scheduling instead of cross CPU
> function call(IPI), so there may be no "eret" after returning from
> this callback function. So used an "isb" here.
>
>>
>>> +
>>> +int __kprobes aarch64_insn_patch_text_sync(void *addrs[], u32 insns[], int cnt)
>>> +{
>>> +    int ret;
>>> +    struct aarch64_insn_patch patch = {
>>> +            .text_addrs = addrs,
>>> +            .new_insns = insns,
>>> +            .insn_cnt = cnt,
>>> +    };
>>> +
>>> +    if (cnt <= 0)
>>> +            return -EINVAL;
>>> +
>>> +    mutex_lock(&text_patch_lock);
>>
>> Again, if you use a loacl variable, I don't think you need the mutex. What
>> do you think?
> Sure, will make the change.
>
>>
>>> +    atomic_set(&text_patch_id, smp_processor_id());
>>> +    ret = stop_machine(aarch64_insn_patch_text_cb, &patch, cpu_online_mask);
>>
>> Instead of doing this, why not instead call aarch64_insn_patch_text_nosync
>> inline, then call kick_all_cpus_sync immediately afterwards, without the
>> need to stop_machine.
> Sandeepa, who is working on kprobe for ARM64, needs the stop_machine()
> mechanism to synchronize all online CPUs, so it's a preparation for
> kprobe.
Hi Gerry,

I have published kprobes patches for ARM64:
http://lwn.net/Articles/570648/ and, using your patchset (v3) for
patching support, it works so far.
I CCed you on my RFC, but unfortunately to your Huawei email, not the Gmail one.

I can give kick_all_cpus_sync a try, but I wanted to understand in a bit
more detail how it differs from stop_machine and how it works.

Thanks,
Sandeepa

>
> Thanks!
> Gerry
>>
>> Will
>>
>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v5 2/7] arm64: introduce interfaces to hotpatch kernel and module code
  2013-11-04 15:12         ` Sandeepa Prabhu
@ 2013-11-06 10:41           ` Will Deacon
  -1 siblings, 0 replies; 54+ messages in thread
From: Will Deacon @ 2013-11-06 10:41 UTC (permalink / raw)
  To: Sandeepa Prabhu
  Cc: Jiang Liu, Steven Rostedt, Catalin Marinas, Jiang Liu,
	linux-arm-kernel, linux-kernel

On Mon, Nov 04, 2013 at 03:12:56PM +0000, Sandeepa Prabhu wrote:
> On 3 November 2013 23:55, Jiang Liu <liuj97@gmail.com> wrote:
> > On 10/30/2013 08:12 AM, Will Deacon wrote:
> >> On Fri, Oct 18, 2013 at 04:19:56PM +0100, Jiang Liu wrote:
> >>> +    atomic_set(&text_patch_id, smp_processor_id());
> >>> +    ret = stop_machine(aarch64_insn_patch_text_cb, &patch, cpu_online_mask);
> >>
> >> Instead of doing this, why not instead call aarch64_insn_patch_text_nosync
> >> inline, then call kick_all_cpus_sync immediately afterwards, without the
> >> need to stop_machine.
> > Sandeepa, who is working on kprobe for ARM64, needs the stop_machine()
> > mechanism to synchronize all online CPUs, so it's a preparation for
> > kprobe.
> 
> I had published kprobes patches for ARM64:
> http://lwn.net/Articles/570648/  and using your patcset (v3) for
> patching support, it works so far.
> I CCed you on my RFC but unfortunately to your huawei email not the gmail.
> 
> I can give a try with kick_all_cpus_sync but wanted to understand this
> a bit  detail on hows different from stop_machine and how this work.

My point was just that for nosync patching, the update to the instruction
stream is atomic with respect to instruction fetch, so stop_machine seems a
bit overkill. kick_all_cpus_sync() can be used to ensure visibility of the new
instruction.
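
I.e. roughly (sketch):

	ret = aarch64_insn_patch_text_nosync(addr, insn);
	if (!ret)
		/* IPI all online CPUs so they see the new instruction */
		kick_all_cpus_sync();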

Jiang Liu seemed to imply that this isn't suitable for kprobes, but I would
like to know if/why that is the case.

Will

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v5 1/7] arm64: introduce basic aarch64 instruction decoding helpers
  2013-10-31 17:39     ` Catalin Marinas
@ 2013-11-06 15:10       ` Jiang Liu
  -1 siblings, 0 replies; 54+ messages in thread
From: Jiang Liu @ 2013-11-06 15:10 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Steven Rostedt, Will Deacon, Sandeepa Prabhu, Marc Zyngier,
	Arnd Bergmann, linux-arm-kernel, linux-kernel

On 11/01/2013 01:39 AM, Catalin Marinas wrote:
> On Fri, Oct 18, 2013 at 04:19:55PM +0100, Jiang Liu wrote:
>> +/*
>> + * ARMv8-A Section B2.6.5:
> 
> Minor comment - please specify the document revision and the title of
> the section as they keep changing in subsequent ARMv8 ARM updates and
> this comment would no longer match.
> 
Thanks for the review, will improve it in the next version.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v5 2/7] arm64: introduce interfaces to hotpatch kernel and module code
  2013-11-06 10:41           ` Will Deacon
@ 2013-11-06 16:12             ` Jiang Liu
  -1 siblings, 0 replies; 54+ messages in thread
From: Jiang Liu @ 2013-11-06 16:12 UTC (permalink / raw)
  To: Will Deacon, Sandeepa Prabhu
  Cc: Steven Rostedt, Catalin Marinas, Jiang Liu, linux-arm-kernel,
	linux-kernel

On 11/06/2013 06:41 PM, Will Deacon wrote:
> On Mon, Nov 04, 2013 at 03:12:56PM +0000, Sandeepa Prabhu wrote:
>> On 3 November 2013 23:55, Jiang Liu <liuj97@gmail.com> wrote:
>>> On 10/30/2013 08:12 AM, Will Deacon wrote:
>>>> On Fri, Oct 18, 2013 at 04:19:56PM +0100, Jiang Liu wrote:
>>>>> +    atomic_set(&text_patch_id, smp_processor_id());
>>>>> +    ret = stop_machine(aarch64_insn_patch_text_cb, &patch, cpu_online_mask);
>>>>
>>>> Instead of doing this, why not instead call aarch64_insn_patch_text_nosync
>>>> inline, then call kick_all_cpus_sync immediately afterwards, without the
>>>> need to stop_machine.
>>> Sandeepa, who is working on kprobe for ARM64, needs the stop_machine()
>>> mechanism to synchronize all online CPUs, so it's a preparation for
>>> kprobe.
>>
>> I had published kprobes patches for ARM64:
>> http://lwn.net/Articles/570648/  and using your patcset (v3) for
>> patching support, it works so far.
>> I CCed you on my RFC but unfortunately to your huawei email not the gmail.
>>
>> I can give a try with kick_all_cpus_sync but wanted to understand this
>> a bit  detail on hows different from stop_machine and how this work.
> 
> My point was just that for nosync patching, the update to the instruction
> stream is atomic with respect to instruction fetch, so stop_machine seems a
> bit overkill. kick_all_cpus can be used to ensure visibility of the new
> instruction.
> 
> Jiang Liu seemed to imply that this isn't suitable for kprobes, but I would
> like to know if/why that is the case.
> 
> Will
> 
Hi Will and Sandeepa,
	There seems to be some misunderstanding here. We provide three
interfaces here:
1) aarch64_insn_patch_text_nosync(), which patches kernel/module text
without explicit synchronization. It may be used in the following cases:
1.a) only one CPU is running, during the early boot stage;
1.b) all other CPUs have been put into a safe state, as in kgdb;
1.c) the instructions before and after patching are both hot-patch safe.

2) aarch64_insn_patch_text_sync(), which patches kernel/module text
with explicit synchronization via the stop_machine() mechanism. It may
be used to support kprobes, because kprobes may patch multiple and/or
non-hotpatch-safe instructions at runtime, so we need stop_machine()
for synchronization.

3) aarch64_insn_patch_text(), which intelligently chooses between
aarch64_insn_patch_text_nosync() and aarch64_insn_patch_text_sync().

So for kprobes, I think we need to use stop_machine() for synchronization.
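
For reference, the selection logic in 3) is roughly the following
(simplified sketch, not the exact code from the patch):

int aarch64_insn_patch_text(void *addrs[], u32 insns[], int cnt)
{
	/* a single hot-patch safe instruction needs no stop_machine() */
	if (cnt == 1 && aarch64_insn_hotpatch_safe(aarch64_insn_read(addrs[0]),
						   insns[0]))
		return aarch64_insn_patch_text_nosync(addrs[0], insns[0]);

	return aarch64_insn_patch_text_sync(addrs, insns, cnt);
}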

Thanks!
Gerry


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v5 4/7] arm64: introduce aarch64_insn_gen_{nop|branch_imm}() helper functions
  2013-10-30  0:48     ` Will Deacon
@ 2013-11-06 16:31       ` Jiang Liu
  -1 siblings, 0 replies; 54+ messages in thread
From: Jiang Liu @ 2013-11-06 16:31 UTC (permalink / raw)
  To: Will Deacon
  Cc: Steven Rostedt, Catalin Marinas, Sandeepa Prabhu, Jiang Liu,
	linux-arm-kernel, linux-kernel

On 10/30/2013 08:48 AM, Will Deacon wrote:
> On Fri, Oct 18, 2013 at 04:19:58PM +0100, Jiang Liu wrote:
>> From: Jiang Liu <jiang.liu@huawei.com>
>>
>> Introduce aarch64_insn_gen_{nop|branch_imm}() helper functions, which
>> will be used to implement jump label on ARM64.
>>
>> Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
>> Cc: Jiang Liu <liuj97@gmail.com>
>> ---
>>  arch/arm64/include/asm/insn.h | 13 +++++++++++++
>>  arch/arm64/kernel/insn.c      | 28 ++++++++++++++++++++++++++++
>>  2 files changed, 41 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
>> index 8f94e48..ee4a60d 100644
>> --- a/arch/arm64/include/asm/insn.h
>> +++ b/arch/arm64/include/asm/insn.h
>> @@ -96,9 +96,22 @@ static __always_inline void aarch64_insn_write(void *addr, u32 insn)
>>  	*(u32 *)addr = cpu_to_le32(insn);
>>  }
>>  
>> +static __always_inline u32 aarch64_insn_gen_hint(enum aarch64_insn_hint_op op)
>> +{
>> +	return aarch64_insn_get_hint_value() | op;
> 
> What's the point in that helper function? Does it just return an immediate
> value (and are there are other users of it?).
It just returns a const value. The helper function is just for ease
of maintenance.

> 
>> +}
>> +
>> +static __always_inline u32 aarch64_insn_gen_nop(void)
>> +{
>> +	return aarch64_insn_gen_hint(AARCH64_INSN_HINT_NOP);
>> +}
> 
> Either use plain old `inline' or write these as preprocessor macros.
Will change them to plain functions (non-inline) and add the __kprobes annotation.
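
I.e. roughly (sketch of the planned change):

u32 __kprobes aarch64_insn_gen_nop(void)
{
	return aarch64_insn_gen_hint(AARCH64_INSN_HINT_NOP);
}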

> 
>> +
>>  enum aarch64_insn_encoding_class aarch64_get_insn_class(u32 insn);
>>  u32 aarch64_insn_encode_immediate(enum aarch64_insn_imm_type type,
>>  				  u32 insn, u64 imm);
>> +u32 aarch64_insn_gen_branch_imm(unsigned long pc, unsigned long addr,
>> +				bool link);
>> +
>>  bool aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn);
>>  
>>  int aarch64_insn_patch_text_nosync(void *addr, u32 insn);
>> diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
>> index 886821f..f7498cc 100644
>> --- a/arch/arm64/kernel/insn.c
>> +++ b/arch/arm64/kernel/insn.c
>> @@ -14,6 +14,7 @@
>>   * You should have received a copy of the GNU General Public License
>>   * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>>   */
>> +#include <linux/bitops.h>
>>  #include <linux/compiler.h>
>>  #include <linux/kernel.h>
>>  #include <linux/smp.h>
>> @@ -233,3 +234,30 @@ u32 aarch64_insn_encode_immediate(enum aarch64_insn_imm_type type,
>>  
>>  	return insn;
>>  }
>> +
>> +u32 aarch64_insn_gen_branch_imm(unsigned long pc, unsigned long addr, bool link)
>> +{
>> +	u32 insn;
>> +	long offset;
>> +
>> +	/*
>> +	 * PC: A 64-bit Program Counter holding the address of the current
>> +	 * instruction. A64 instructions may be word-aligned.
>> +	 */
>> +	BUG_ON((pc & 0x3) || (addr & 0x3));
>> +
>> +	/*
>> +	 * B/BL support [-128M, 128M) offset
>> +	 * ARM64 virtual address arrangement garantees all kernel and module
> 
> Typo: `guarantees'.
Thanks, will fix it.

> 
>> +	 * texts are within +/-128M.
>> +	 */
>> +	offset = ((long)addr - (long)pc) >> 2;
>> +	BUG_ON(abs(offset) > BIT(25) || offset == BIT(25));
> 
> I really struggle to follow this range checking. Why don't you just compare
> the absolute difference between addr and pc with SZ_128M?
Will change code to follow your suggestion.
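
Something like this, I guess (sketch):

	long offset = (long)addr - (long)pc;

	/* B/BL encode a signed 26-bit word offset, i.e. bytes in [-128M, 128M) */
	BUG_ON(offset < -SZ_128M || offset >= SZ_128M);

	offset >>= 2;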

> 
> Will
> 


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v5 5/7] arm64, jump label: detect %c support for ARM64
  2013-10-30  0:49     ` Will Deacon
@ 2013-11-06 16:32       ` Jiang Liu
  -1 siblings, 0 replies; 54+ messages in thread
From: Jiang Liu @ 2013-11-06 16:32 UTC (permalink / raw)
  To: Will Deacon
  Cc: Steven Rostedt, Catalin Marinas, Sandeepa Prabhu, linux-kernel,
	linux-arm-kernel

On 10/30/2013 08:49 AM, Will Deacon wrote:
> On Fri, Oct 18, 2013 at 04:19:59PM +0100, Jiang Liu wrote:
>> From: Jiang Liu <jiang.liu@huawei.com>
>>
>> As commit a9468f30b5eac6 "ARM: 7333/2: jump label: detect %c
>> support for ARM", this patch detects the same thing for ARM64
>> because some ARM64 GCC versions have the same issue.
>>
>> Some versions of ARM64 GCC which do support asm goto, do not
>> support the %c specifier. Since we need the %c to support jump
>> labels on ARM64, detect that too in the asm goto detection script
>> to avoid build errors with these versions.
> 
> Acked-by: Will Deacon <will.deacon@arm.com>
> 
> The toolchain is being fixed upstream for this, but we should have the check
> for older versions.
> 
> Will
> 
Thanks for the review!


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v5 6/7] arm64, jump label: optimize jump label implementation
  2013-10-30  0:55     ` Will Deacon
@ 2013-11-06 16:38       ` Jiang Liu
  -1 siblings, 0 replies; 54+ messages in thread
From: Jiang Liu @ 2013-11-06 16:38 UTC (permalink / raw)
  To: Will Deacon
  Cc: Steven Rostedt, Catalin Marinas, Sandeepa Prabhu, Marc Zyngier,
	Arnd Bergmann, linux-arm-kernel, linux-kernel

On 10/30/2013 08:55 AM, Will Deacon wrote:
> On Fri, Oct 18, 2013 at 04:20:00PM +0100, Jiang Liu wrote:
>> From: Jiang Liu <jiang.liu@huawei.com>
>>
>> Optimize jump label implementation for ARM64 by dynamically patching
>> kernel text.
>>
>> Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
>> Cc: Jiang Liu <liuj97@gmail.com>
>> ---
>>  arch/arm64/Kconfig                  |  1 +
>>  arch/arm64/include/asm/jump_label.h | 52 +++++++++++++++++++++++++++++++++
>>  arch/arm64/kernel/Makefile          |  1 +
>>  arch/arm64/kernel/jump_label.c      | 58 +++++++++++++++++++++++++++++++++++++
>>  4 files changed, 112 insertions(+)
>>  create mode 100644 arch/arm64/include/asm/jump_label.h
>>  create mode 100644 arch/arm64/kernel/jump_label.c
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index c044548..da388e4 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -17,6 +17,7 @@ config ARM64
>>  	select GENERIC_SMP_IDLE_THREAD
>>  	select GENERIC_TIME_VSYSCALL
>>  	select HARDIRQS_SW_RESEND
>> +	select HAVE_ARCH_JUMP_LABEL
>>  	select HAVE_ARCH_TRACEHOOK
>>  	select HAVE_DEBUG_BUGVERBOSE
>>  	select HAVE_DEBUG_KMEMLEAK
>> diff --git a/arch/arm64/include/asm/jump_label.h b/arch/arm64/include/asm/jump_label.h
>> new file mode 100644
>> index 0000000..d268fab
>> --- /dev/null
>> +++ b/arch/arm64/include/asm/jump_label.h
>> @@ -0,0 +1,52 @@
>> +/*
>> + * Copyright (C) 2013 Huawei Ltd.
>> + * Author: Jiang Liu <jiang.liu@huawei.com>
>> + *
>> + * Based on arch/arm/include/asm/jump_label.h
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +#ifndef _ASM_ARM64_JUMP_LABEL_H
>> +#define _ASM_ARM64_JUMP_LABEL_H
>> +#include <linux/types.h>
>> +
>> +#ifdef __KERNEL__
>> +
>> +#define JUMP_LABEL_NOP_SIZE 4
>> +
>> +static __always_inline bool arch_static_branch(struct static_key *key)
>> +{
>> +	asm goto("1:\n\t"
>> +		 "nop\n\t"
> 
> Minor nit: can you put the label on the same line as the nop please?
Sure, will refine it.

> 
>> +		 ".pushsection __jump_table,  \"aw\"\n\t"
>> +		 ".align 3\n\t"
>> +		 ".quad 1b, %l[l_yes], %c0\n\t"
>> +		 ".popsection\n\t"
>> +		 :  :  "i"(key) :  : l_yes);
>> +
>> +	return false;
>> +l_yes:
>> +	return true;
>> +}
>> +
>> +#endif /* __KERNEL__ */
>> +
>> +typedef u64 jump_label_t;
>> +
>> +struct jump_entry {
>> +	jump_label_t code;
>> +	jump_label_t target;
>> +	jump_label_t key;
>> +};
>> +
>> +#endif	/* _ASM_ARM64_JUMP_LABEL_H */
>> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
>> index 9af6cb3..b7db65e 100644
>> --- a/arch/arm64/kernel/Makefile
>> +++ b/arch/arm64/kernel/Makefile
>> @@ -18,6 +18,7 @@ arm64-obj-$(CONFIG_SMP)			+= smp.o smp_spin_table.o smp_psci.o
>>  arm64-obj-$(CONFIG_HW_PERF_EVENTS)	+= perf_event.o
>>  arm64-obj-$(CONFIG_HAVE_HW_BREAKPOINT)+= hw_breakpoint.o
>>  arm64-obj-$(CONFIG_EARLY_PRINTK)	+= early_printk.o
>> +arm64-obj-$(CONFIG_JUMP_LABEL)		+= jump_label.o
>>  
>>  obj-y					+= $(arm64-obj-y) vdso/
>>  obj-m					+= $(arm64-obj-m)
>> diff --git a/arch/arm64/kernel/jump_label.c b/arch/arm64/kernel/jump_label.c
>> new file mode 100644
>> index 0000000..871786a
>> --- /dev/null
>> +++ b/arch/arm64/kernel/jump_label.c
>> @@ -0,0 +1,58 @@
>> +/*
>> + * Copyright (C) 2013 Huawei Ltd.
>> + * Author: Jiang Liu <jiang.liu@huawei.com>
>> + *
>> + * Based on arch/arm/kernel/jump_label.c
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +#include <linux/kernel.h>
>> +#include <linux/jump_label.h>
>> +#include <asm/jump_label.h>
>> +#include <asm/insn.h>
>> +
>> +#ifdef HAVE_JUMP_LABEL
>> +
>> +static void __arch_jump_label_transform(struct jump_entry *entry,
>> +					enum jump_label_type type,
>> +					bool is_static)
>> +{
>> +	void *addr = (void *)entry->code;
>> +	u32 insn;
>> +
>> +	if (type == JUMP_LABEL_ENABLE) {
>> +		insn = aarch64_insn_gen_branch_imm(entry->code,
>> +						   entry->target, 0);
> 
> Maybe have an enum or #define for the last parameter -- it's not at all
> obvious that '0' means `no link'.
Thanks, will improve it.
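
For example, something like this (illustrative naming only):

enum aarch64_insn_branch_type {
	AARCH64_INSN_BRANCH_NOLINK,	/* B  */
	AARCH64_INSN_BRANCH_LINK,	/* BL */
};

so that the call site reads:

		insn = aarch64_insn_gen_branch_imm(entry->code, entry->target,
						   AARCH64_INSN_BRANCH_NOLINK);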

> 
>> +	} else {
>> +		insn = aarch64_insn_gen_nop();
>> +	}
>> +
>> +	if (is_static)
>> +		aarch64_insn_patch_text_nosync(&addr, &insn, 1);
> 
> I think I mentioned this before, but is it ever safe to patch more than one
> instruction in the _nosync case? If not, I wouldn't even bother with that
> last argument.
I originally planned to optimize the jump_label implementation using the
third parameter. But for now, I will remove it to keep the code simple
and add it back in the future when needed.
Thanks!

> 
> Will
> 


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v5 6/7] arm64, jump label: optimize jump label implementation
@ 2013-11-06 16:38       ` Jiang Liu
  0 siblings, 0 replies; 54+ messages in thread
From: Jiang Liu @ 2013-11-06 16:38 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/30/2013 08:55 AM, Will Deacon wrote:
> On Fri, Oct 18, 2013 at 04:20:00PM +0100, Jiang Liu wrote:
>> From: Jiang Liu <jiang.liu@huawei.com>
>>
>> Optimize jump label implementation for ARM64 by dynamically patching
>> kernel text.
>>
>> Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
>> Cc: Jiang Liu <liuj97@gmail.com>
>> ---
>>  arch/arm64/Kconfig                  |  1 +
>>  arch/arm64/include/asm/jump_label.h | 52 +++++++++++++++++++++++++++++++++
>>  arch/arm64/kernel/Makefile          |  1 +
>>  arch/arm64/kernel/jump_label.c      | 58 +++++++++++++++++++++++++++++++++++++
>>  4 files changed, 112 insertions(+)
>>  create mode 100644 arch/arm64/include/asm/jump_label.h
>>  create mode 100644 arch/arm64/kernel/jump_label.c
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index c044548..da388e4 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -17,6 +17,7 @@ config ARM64
>>  	select GENERIC_SMP_IDLE_THREAD
>>  	select GENERIC_TIME_VSYSCALL
>>  	select HARDIRQS_SW_RESEND
>> +	select HAVE_ARCH_JUMP_LABEL
>>  	select HAVE_ARCH_TRACEHOOK
>>  	select HAVE_DEBUG_BUGVERBOSE
>>  	select HAVE_DEBUG_KMEMLEAK
>> diff --git a/arch/arm64/include/asm/jump_label.h b/arch/arm64/include/asm/jump_label.h
>> new file mode 100644
>> index 0000000..d268fab
>> --- /dev/null
>> +++ b/arch/arm64/include/asm/jump_label.h
>> @@ -0,0 +1,52 @@
>> +/*
>> + * Copyright (C) 2013 Huawei Ltd.
>> + * Author: Jiang Liu <jiang.liu@huawei.com>
>> + *
>> + * Based on arch/arm/include/asm/jump_label.h
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +#ifndef _ASM_ARM64_JUMP_LABEL_H
>> +#define _ASM_ARM64_JUMP_LABEL_H
>> +#include <linux/types.h>
>> +
>> +#ifdef __KERNEL__
>> +
>> +#define JUMP_LABEL_NOP_SIZE 4
>> +
>> +static __always_inline bool arch_static_branch(struct static_key *key)
>> +{
>> +	asm goto("1:\n\t"
>> +		 "nop\n\t"
> 
> Minor nit: can you put the label on the same line as the nop please?
Sure, will refine it.

> 
>> +		 ".pushsection __jump_table,  \"aw\"\n\t"
>> +		 ".align 3\n\t"
>> +		 ".quad 1b, %l[l_yes], %c0\n\t"
>> +		 ".popsection\n\t"
>> +		 :  :  "i"(key) :  : l_yes);
>> +
>> +	return false;
>> +l_yes:
>> +	return true;
>> +}
>> +
>> +#endif /* __KERNEL__ */
>> +
>> +typedef u64 jump_label_t;
>> +
>> +struct jump_entry {
>> +	jump_label_t code;
>> +	jump_label_t target;
>> +	jump_label_t key;
>> +};
>> +
>> +#endif	/* _ASM_ARM64_JUMP_LABEL_H */
>> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
>> index 9af6cb3..b7db65e 100644
>> --- a/arch/arm64/kernel/Makefile
>> +++ b/arch/arm64/kernel/Makefile
>> @@ -18,6 +18,7 @@ arm64-obj-$(CONFIG_SMP)			+= smp.o smp_spin_table.o smp_psci.o
>>  arm64-obj-$(CONFIG_HW_PERF_EVENTS)	+= perf_event.o
>>  arm64-obj-$(CONFIG_HAVE_HW_BREAKPOINT)+= hw_breakpoint.o
>>  arm64-obj-$(CONFIG_EARLY_PRINTK)	+= early_printk.o
>> +arm64-obj-$(CONFIG_JUMP_LABEL)		+= jump_label.o
>>  
>>  obj-y					+= $(arm64-obj-y) vdso/
>>  obj-m					+= $(arm64-obj-m)
>> diff --git a/arch/arm64/kernel/jump_label.c b/arch/arm64/kernel/jump_label.c
>> new file mode 100644
>> index 0000000..871786a
>> --- /dev/null
>> +++ b/arch/arm64/kernel/jump_label.c
>> @@ -0,0 +1,58 @@
>> +/*
>> + * Copyright (C) 2013 Huawei Ltd.
>> + * Author: Jiang Liu <jiang.liu@huawei.com>
>> + *
>> + * Based on arch/arm/kernel/jump_label.c
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +#include <linux/kernel.h>
>> +#include <linux/jump_label.h>
>> +#include <asm/jump_label.h>
>> +#include <asm/insn.h>
>> +
>> +#ifdef HAVE_JUMP_LABEL
>> +
>> +static void __arch_jump_label_transform(struct jump_entry *entry,
>> +					enum jump_label_type type,
>> +					bool is_static)
>> +{
>> +	void *addr = (void *)entry->code;
>> +	u32 insn;
>> +
>> +	if (type == JUMP_LABEL_ENABLE) {
>> +		insn = aarch64_insn_gen_branch_imm(entry->code,
>> +						   entry->target, 0);
> 
> Maybe have an enum or #define for the last parameter -- it's not at all
> obvious that '0' means `no link'.
Thanks, will improve it.
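A small enum would probably read best here, e.g. (the names below are only
illustrative, not final):

	enum aarch64_insn_branch_type {
		AARCH64_INSN_BRANCH_NOLINK,	/* plain B */
		AARCH64_INSN_BRANCH_LINK,	/* BL, writes the link register */
	};

	insn = aarch64_insn_gen_branch_imm(entry->code, entry->target,
					   AARCH64_INSN_BRANCH_NOLINK);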

> 
>> +	} else {
>> +		insn = aarch64_insn_gen_nop();
>> +	}
>> +
>> +	if (is_static)
>> +		aarch64_insn_patch_text_nosync(&addr, &insn, 1);
> 
> I think I mentioned this before, but is it ever safe to patch more than one
> instruction in the _nosync case? If not, I wouldn't even bother with that
> last argument.
I originally planned to optimize the jump_label implementation with the third
parameter. But for now, I will remove it to keep the code simple and add it
back in the future when needed.
Thanks!
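With the count dropped, the single-instruction interface would reduce to
roughly this (a sketch under that assumption, not a final signature):

	int aarch64_insn_patch_text_nosync(void *addr, u32 insn);

and the call quoted above becomes simply:

	if (is_static)
		aarch64_insn_patch_text_nosync(addr, insn);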

> 
> Will
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v5 4/7] arm64: introduce aarch64_insn_gen_{nop|branch_imm}() helper functions
  2013-10-30  0:48     ` Will Deacon
@ 2013-11-06 16:45       ` Steven Rostedt
  -1 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2013-11-06 16:45 UTC (permalink / raw)
  To: Will Deacon
  Cc: Jiang Liu, Catalin Marinas, Sandeepa Prabhu, Jiang Liu,
	linux-arm-kernel, linux-kernel

On Wed, 30 Oct 2013 00:48:40 +0000
Will Deacon <will.deacon@arm.com> wrote:


> > +}
> > +
> > +static __always_inline u32 aarch64_insn_gen_nop(void)
> > +{
> > +	return aarch64_insn_gen_hint(AARCH64_INSN_HINT_NOP);
> > +}
> 
> Either use plain old `inline' or write these as preprocessor macros.
> 

I'm curious as to why you say that. Preprocessor macros are rather ugly,
and on x86 we use "__always_inline" quite liberally.

-- Steve

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v5 4/7] arm64: introduce aarch64_insn_gen_{nop|branch_imm}() helper functions
@ 2013-11-06 16:45       ` Steven Rostedt
  0 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2013-11-06 16:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 30 Oct 2013 00:48:40 +0000
Will Deacon <will.deacon@arm.com> wrote:


> > +}
> > +
> > +static __always_inline u32 aarch64_insn_gen_nop(void)
> > +{
> > +	return aarch64_insn_gen_hint(AARCH64_INSN_HINT_NOP);
> > +}
> 
> Either use plain old `inline' or write these as preprocessor macros.
> 

I'm curious as to why you say that. Preprocessor macros are rather ugly,
and on x86 we use "__always_inline" quite liberally.

-- Steve

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v5 4/7] arm64: introduce aarch64_insn_gen_{nop|branch_imm}() helper functions
  2013-11-06 16:45       ` Steven Rostedt
@ 2013-11-06 18:02         ` Will Deacon
  -1 siblings, 0 replies; 54+ messages in thread
From: Will Deacon @ 2013-11-06 18:02 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Jiang Liu, Catalin Marinas, Sandeepa Prabhu, Jiang Liu,
	linux-arm-kernel, linux-kernel

On Wed, Nov 06, 2013 at 04:45:30PM +0000, Steven Rostedt wrote:
> On Wed, 30 Oct 2013 00:48:40 +0000
> Will Deacon <will.deacon@arm.com> wrote:
> 
> 
> > > +}
> > > +
> > > +static __always_inline u32 aarch64_insn_gen_nop(void)
> > > +{
> > > +	return aarch64_insn_gen_hint(AARCH64_INSN_HINT_NOP);
> > > +}
> > 
> > Either use plain old `inline' or write these as preprocessor macros.
> > 
> 
> I'm curious as to why you say that. Preprocessor macros are rather ugly,
> and on x86 we use "__always_inline" quite liberally.

I can understand why you might use __always_inline over the preprocessor
when you *really* need something inlined, but in this case I don't see why
`inline' isn't sufficient. It's just a cosmetic issue, really.

Will

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v5 4/7] arm64: introduce aarch64_insn_gen_{nop|branch_imm}() helper functions
@ 2013-11-06 18:02         ` Will Deacon
  0 siblings, 0 replies; 54+ messages in thread
From: Will Deacon @ 2013-11-06 18:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Nov 06, 2013 at 04:45:30PM +0000, Steven Rostedt wrote:
> On Wed, 30 Oct 2013 00:48:40 +0000
> Will Deacon <will.deacon@arm.com> wrote:
> 
> 
> > > +}
> > > +
> > > +static __always_inline u32 aarch64_insn_gen_nop(void)
> > > +{
> > > +	return aarch64_insn_gen_hint(AARCH64_INSN_HINT_NOP);
> > > +}
> > 
> > Either use plain old `inline' or write these as preprocessor macros.
> > 
> 
> I'm curious as to why you say that. Preprocessor macros are rather ugly,
> and on x86 we use "__always_inline" quite liberally.

I can understand why you might use __always_inline over the preprocessor
when you *really* need something inlined, but in this case I don't see why
`inline' isn't sufficient. It's just a cosmetic issue, really.

Will

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v5 2/7] arm64: introduce interfaces to hotpatch kernel and module code
  2013-11-06 16:12             ` Jiang Liu
@ 2013-11-06 18:09               ` Will Deacon
  -1 siblings, 0 replies; 54+ messages in thread
From: Will Deacon @ 2013-11-06 18:09 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Sandeepa Prabhu, Steven Rostedt, Catalin Marinas, Jiang Liu,
	linux-arm-kernel, linux-kernel

On Wed, Nov 06, 2013 at 04:12:13PM +0000, Jiang Liu wrote:
> On 11/06/2013 06:41 PM, Will Deacon wrote:
> > On Mon, Nov 04, 2013 at 03:12:56PM +0000, Sandeepa Prabhu wrote:
> >> On 3 November 2013 23:55, Jiang Liu <liuj97@gmail.com> wrote:
> >>> On 10/30/2013 08:12 AM, Will Deacon wrote:
> >>>> On Fri, Oct 18, 2013 at 04:19:56PM +0100, Jiang Liu wrote:
> >>>>> +    atomic_set(&text_patch_id, smp_processor_id());
> >>>>> +    ret = stop_machine(aarch64_insn_patch_text_cb, &patch, cpu_online_mask);
> >>>>
> >>>> Instead of doing this, why not instead call aarch64_insn_patch_text_nosync
> >>>> inline, then call kick_all_cpus_sync immediately afterwards, without the
> >>>> need to stop_machine.
> >>> Sandeepa, who is working on kprobe for ARM64, needs the stop_machine()
> >>> mechanism to synchronize all online CPUs, so it's a preparation for
> >>> kprobe.
> >>
> >> I had published kprobes patches for ARM64:
> >> http://lwn.net/Articles/570648/ and using your patchset (v3) for
> >> patching support, it works so far.
> >> I CCed you on my RFC but unfortunately to your huawei email not the gmail.
> >>
> >> I can give kick_all_cpus_sync a try, but I wanted to understand this in
> >> a bit more detail: how it differs from stop_machine and how it works.
> > 
> > My point was just that for nosync patching, the update to the instruction
> > stream is atomic with respect to instruction fetch, so stop_machine seems a
> > bit overkill. kick_all_cpus can be used to ensure visibility of the new
> > instruction.
> > 
> > Jiang Liu seemed to imply that this isn't suitable for kprobes, but I would
> > like to know if/why that is the case.
> > 
> > Will
> > 
> Hi Will and Sandeepa,
> 	There seems to be some misunderstanding here. We provide three
> interfaces here:
> 1) aarch64_insn_patch_text_nosync(), which patches kernel/module text
> without explicit synchronization. It may be used in any of these cases:
> 1.a) there is only one CPU running (e.g. during the early boot stage);
> 1.b) all other CPUs have been put into a safe state, as in kgdb;
> 1.c) the instructions before and after patching are both hot-patch safe.
> 
> 2) aarch64_insn_patch_text_sync(), which patches kernel/module text
> with explicit synchronization via the stop_machine() mechanism. It may
> be used to support kprobes, because kprobes may patch multiple and/or
> non-hotpatch-safe instructions at runtime, so we need stop_machine()
> for synchronization.
> 
> 3) aarch64_insn_patch_text(), which intelligently chooses between
> aarch64_insn_patch_text_nosync() and aarch64_insn_patch_text_sync().
> 
> So for kprobes, I think we need to use stop_machine() for synchronization.
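For reference, a minimal sketch of the dispatch described above (the
hot-patch-safe check and the exact prototypes are assumptions here, not
code quoted from the series):

	int aarch64_insn_patch_text(void *addrs[], u32 insns[], int cnt)
	{
		u32 old;

		/*
		 * A single replacement where both the old and the new
		 * instruction are hot-patch safe can skip stop_machine();
		 * everything else goes through the stop_machine() path.
		 */
		if (cnt == 1) {
			old = le32_to_cpu(*(__le32 *)addrs[0]);
			if (aarch64_insn_hotpatch_safe(old, insns[0]))
				return aarch64_insn_patch_text_nosync(addrs[0],
								      insns[0]);
		}

		return aarch64_insn_patch_text_sync(addrs, insns, cnt);
	}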

Yup that sounds right. I think I must've got confused as to what exactly was
being patched and the interaction with the mutex; if we're patching live
code which isn't hot-patch safe, then stop_machine sounds like the thing to
do.

Sorry for the confusion,

Will

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v5 2/7] arm64: introduce interfaces to hotpatch kernel and module code
@ 2013-11-06 18:09               ` Will Deacon
  0 siblings, 0 replies; 54+ messages in thread
From: Will Deacon @ 2013-11-06 18:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Nov 06, 2013 at 04:12:13PM +0000, Jiang Liu wrote:
> On 11/06/2013 06:41 PM, Will Deacon wrote:
> > On Mon, Nov 04, 2013 at 03:12:56PM +0000, Sandeepa Prabhu wrote:
> >> On 3 November 2013 23:55, Jiang Liu <liuj97@gmail.com> wrote:
> >>> On 10/30/2013 08:12 AM, Will Deacon wrote:
> >>>> On Fri, Oct 18, 2013 at 04:19:56PM +0100, Jiang Liu wrote:
> >>>>> +    atomic_set(&text_patch_id, smp_processor_id());
> >>>>> +    ret = stop_machine(aarch64_insn_patch_text_cb, &patch, cpu_online_mask);
> >>>>
> >>>> Instead of doing this, why not instead call aarch64_insn_patch_text_nosync
> >>>> inline, then call kick_all_cpus_sync immediately afterwards, without the
> >>>> need to stop_machine.
> >>> Sandeepa, who is working on kprobe for ARM64, needs the stop_machine()
> >>> mechanism to synchronize all online CPUs, so it's a preparation for
> >>> kprobe.
> >>
> >> I had published kprobes patches for ARM64:
> >> http://lwn.net/Articles/570648/ and using your patchset (v3) for
> >> patching support, it works so far.
> >> I CCed you on my RFC but unfortunately to your huawei email not the gmail.
> >>
> >> I can give kick_all_cpus_sync a try, but I wanted to understand this in
> >> a bit more detail: how it differs from stop_machine and how it works.
> > 
> > My point was just that for nosync patching, the update to the instruction
> > stream is atomic with respect to instruction fetch, so stop_machine seems a
> > bit overkill. kick_all_cpus can be used to ensure visibility of the new
> > instruction.
> > 
> > Jiang Liu seemed to imply that this isn't suitable for kprobes, but I would
> > like to know if/why that is the case.
> > 
> > Will
> > 
> Hi Will and Sandeepa,
> 	There seems to be some misunderstanding here. We provide three
> interfaces here:
> 1) aarch64_insn_patch_text_nosync(), which patches kernel/module text
> without explicit synchronization. It may be used in any of these cases:
> 1.a) there is only one CPU running (e.g. during the early boot stage);
> 1.b) all other CPUs have been put into a safe state, as in kgdb;
> 1.c) the instructions before and after patching are both hot-patch safe.
> 
> 2) aarch64_insn_patch_text_sync(), which patches kernel/module text
> with explicit synchronization via the stop_machine() mechanism. It may
> be used to support kprobes, because kprobes may patch multiple and/or
> non-hotpatch-safe instructions at runtime, so we need stop_machine()
> for synchronization.
> 
> 3) aarch64_insn_patch_text(), which intelligently chooses between
> aarch64_insn_patch_text_nosync() and aarch64_insn_patch_text_sync().
> 
> So for kprobes, I think we need to use stop_machine() for synchronization.

Yup that sounds right. I think I must've got confused as to what exactly was
being patched and the interaction with the mutex; if we're patching live
code which isn't hot-patch safe, then stop_machine sounds like the thing to
do.

Sorry for the confusion,

Will

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v5 2/7] arm64: introduce interfaces to hotpatch kernel and module code
  2013-11-03 15:55       ` Jiang Liu
@ 2013-11-27 12:20         ` Will Deacon
  -1 siblings, 0 replies; 54+ messages in thread
From: Will Deacon @ 2013-11-27 12:20 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Steven Rostedt, Catalin Marinas, Sandeepa Prabhu, Jiang Liu,
	linux-arm-kernel, linux-kernel

Hi Jiang Liu,

On Sun, Nov 03, 2013 at 03:55:21PM +0000, Jiang Liu wrote:
> On 10/30/2013 08:12 AM, Will Deacon wrote:
> > If I try and email you at your Huawei address, I get a bounce from the mail
> > server. Is that expected? If so, it's not very helpful from a commit log
> > perspective if you use that address as the author on your patches.
> > 
> Hi Will,
> 	Sorry for the inconvenience. I left Huawei recently and have
> been on vacation for the last two weeks. I will use my gmail account
> next time.

I appreciate that you're probably not being paid to look at this anymore,
but have you had a chance to address the comments I made on version 5 of
this series? As far as I'm concerned, this is almost ready for merging and
I'd love to see it get in for 3.14!

It would be a real shame if this ended up getting dropped...

Cheers,

Will

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v5 2/7] arm64: introduce interfaces to hotpatch kernel and module code
@ 2013-11-27 12:20         ` Will Deacon
  0 siblings, 0 replies; 54+ messages in thread
From: Will Deacon @ 2013-11-27 12:20 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Jiang Liu,

On Sun, Nov 03, 2013 at 03:55:21PM +0000, Jiang Liu wrote:
> On 10/30/2013 08:12 AM, Will Deacon wrote:
> > If I try and email you at your Huawei address, I get a bounce from the mail
> > server. Is that expected? If so, it's not very helpful from a commit log
> > perspective if you use that address as the author on your patches.
> > 
> Hi Will,
> 	Sorry for the inconvenience. I left Huawei recently and have
> been on vacation for the last two weeks. I will use my gmail account
> next time.

I appreciate that you're probably not being paid to look at this anymore,
but have you had a chance to address the comments I made on version 5 of
this series? As far as I'm concerned, this is almost ready for merging and
I'd love to see it get in for 3.14!

It would be a real shame if this ended up getting dropped...

Cheers,

Will

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v5 2/7] arm64: introduce interfaces to hotpatch kernel and module code
  2013-11-27 12:20         ` Will Deacon
@ 2013-11-27 15:40           ` Jiang Liu
  -1 siblings, 0 replies; 54+ messages in thread
From: Jiang Liu @ 2013-11-27 15:40 UTC (permalink / raw)
  To: Will Deacon
  Cc: Steven Rostedt, Catalin Marinas, Sandeepa Prabhu, Jiang Liu,
	linux-arm-kernel, linux-kernel

On 11/27/2013 08:20 PM, Will Deacon wrote:
> Hi Jiang Liu,
> 
> On Sun, Nov 03, 2013 at 03:55:21PM +0000, Jiang Liu wrote:
>> On 10/30/2013 08:12 AM, Will Deacon wrote:
>>> If I try and email you at your Huawei address, I get a bounce from the mail
>>> server. Is that expected? If so, it's not very helpful from a commit log
>>> perspective if you use that address as the author on your patches.
>>>
>> Hi Will,
>> 	Sorry for the inconvenience. I left Huawei recently and have
>> been on vacation for the last two weeks. I will use my gmail account
>> next time.
> 
> I appreciate that you're probably not being paid to look at this anymore,
> but have you had a chance to address the comments I made on version 5 of
> this series? As far as I'm concerned, this is almost ready for merging and
> I'd love to see it get in for 3.14!
> 
> It would be a real shame if this ended up getting dropped...
Hi Will,
	Thanks for the reminder!
	I have changed the code according to the review comments,
but I'm having trouble finding a real hardware platform to test
the new code on since I left Huawei. I will try to figure out a way
to test the code by contacting an ex-teammate at Huawei.
	I hope it can catch the 3.14 merge window.
Cheers,
Gerry
> 
> Cheers,
> 
> Will
> 


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v5 2/7] arm64: introduce interfaces to hotpatch kernel and module code
@ 2013-11-27 15:40           ` Jiang Liu
  0 siblings, 0 replies; 54+ messages in thread
From: Jiang Liu @ 2013-11-27 15:40 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/27/2013 08:20 PM, Will Deacon wrote:
> Hi Jiang Liu,
> 
> On Sun, Nov 03, 2013 at 03:55:21PM +0000, Jiang Liu wrote:
>> On 10/30/2013 08:12 AM, Will Deacon wrote:
>>> If I try and email you at your Huawei address, I get a bounce from the mail
>>> server. Is that expected? If so, it's not very helpful from a commit log
>>> perspective if you use that address as the author on your patches.
>>>
>> Hi Will,
>> 	Sorry for the inconvenience. I left Huawei recently and have
>> been on vacation for the last two weeks. I will use my gmail account
>> next time.
> 
> I appreciate that you're probably not being paid to look at this anymore,
> but have you had a chance to address the comments I made on version 5 of
> this series? As far as I'm concerned, this is almost ready for merging and
> I'd love to see it get in for 3.14!
> 
> It would be a real shame if this ended up getting dropped...
Hi Will,
	Thanks for the reminder!
	I have changed the code according to the review comments,
but I'm having trouble finding a real hardware platform to test
the new code on since I left Huawei. I will try to figure out a way
to test the code by contacting an ex-teammate at Huawei.
	I hope it can catch the 3.14 merge window.
Cheers,
Gerry
> 
> Cheers,
> 
> Will
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v5 2/7] arm64: introduce interfaces to hotpatch kernel and module code
  2013-11-03 15:55       ` Jiang Liu
@ 2013-11-28 22:39         ` AKASHI Takahiro
  -1 siblings, 0 replies; 54+ messages in thread
From: AKASHI Takahiro @ 2013-11-28 22:39 UTC (permalink / raw)
  To: Jiang Liu, Will Deacon
  Cc: Sandeepa Prabhu, Catalin Marinas, linux-kernel, Steven Rostedt,
	linux-arm-kernel

On 11/04/2013 12:55 AM, Jiang Liu wrote:
> On 10/30/2013 08:12 AM, Will Deacon wrote:
>> Hi Jiang Liu,
>>
>>> +static __always_inline u32 aarch64_insn_read(void *addr)
>>> +{
>>> +	return le32_to_cpu(*(u32 *)addr);
>>> +}
>>>
>>> +static __always_inline void aarch64_insn_write(void *addr, u32 insn)
>>> +{
>>> +	*(u32 *)addr = cpu_to_le32(insn);
>>> +}
>>
>> I wouldn't bother with these helpers. You should probably be using
>> probe_kernel_address or similar, then doing the endianness swabbing on the
>> return value in-line.
> How about keeping and refining the aarch64_insn_read/write interfaces
> by implementing them with probe_kernel_address()? I think they may be used
> in other places when supporting a big-endian ARM64 kernel.
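A rough sketch of what such a refined read helper could look like, assuming
probe_kernel_read() underneath (error handling kept minimal):

	int aarch64_insn_read(void *addr, u32 *insnp)
	{
		__le32 val;
		int ret;

		ret = probe_kernel_read(&val, addr, sizeof(val));
		if (!ret)
			*insnp = le32_to_cpu(val);

		return ret;
	}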

I prefer it (using probe_kernel_read/write) for my ftrace patch.
I would be able to replace some portion of my own function (ftrace_modify_code)
with aarch64_insn_patch_text_nosync().

See my comment here:
http://lists.infradead.org/pipermail/linux-arm-kernel/2013-October/207001.html
([PATCH 2/6])

The current implementation assumes stop_machine() (via arch_ftrace_update_code()
in generic ftrace), and, given the discussion between you and Will, I wonder
whether it might be relaxed, because ftrace on arm64 modifies only a single
branch or nop instruction at any time.
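Under that assumption, the arm64 ftrace_modify_code() could end up as little
more than the following (a sketch only, reusing the two-argument
aarch64_insn_read()/patch_text_nosync() forms sketched above; the validation
step is an assumption, not code from the series):

	static int ftrace_modify_code(unsigned long pc, u32 old, u32 new)
	{
		u32 replaced;

		/* make sure the site still holds the instruction we expect */
		if (aarch64_insn_read((void *)pc, &replaced))
			return -EFAULT;
		if (replaced != old)
			return -EINVAL;

		/* single, hot-patch-safe replacement: no stop_machine() needed */
		return aarch64_insn_patch_text_nosync((void *)pc, new);
	}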

-Takahiro AKASHI

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v5 2/7] arm64: introduce interfaces to hotpatch kernel and module code
@ 2013-11-28 22:39         ` AKASHI Takahiro
  0 siblings, 0 replies; 54+ messages in thread
From: AKASHI Takahiro @ 2013-11-28 22:39 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/04/2013 12:55 AM, Jiang Liu wrote:
> On 10/30/2013 08:12 AM, Will Deacon wrote:
>> Hi Jiang Liu,
>>
>>> +static __always_inline u32 aarch64_insn_read(void *addr)
>>> +{
>>> +	return le32_to_cpu(*(u32 *)addr);
>>> +}
>>>
>>> +static __always_inline void aarch64_insn_write(void *addr, u32 insn)
>>> +{
>>> +	*(u32 *)addr = cpu_to_le32(insn);
>>> +}
>>
>> I wouldn't bother with these helpers. You should probably be using
>> probe_kernel_address or similar, then doing the endianness swabbing on the
>> return value in-line.
> How about keeping and refining the aarch64_insn_read/write interfaces
> by implementing them with probe_kernel_address()? I think they may be used
> in other places when supporting a big-endian ARM64 kernel.

I prefer it (using probe_kernel_read/write) for my ftrace patch.
I would be able to replace some portion of my own function (ftrace_modify_code)
with aarch64_insn_patch_text_nosync().

See my comment here:
http://lists.infradead.org/pipermail/linux-arm-kernel/2013-October/207001.html
([PATCH 2/6])

The current implementation assumes stop_machine() (via arch_ftrace_update_code()
in generic ftrace), and, given the discussion between you and Will, I wonder
whether it might be relaxed, because ftrace on arm64 modifies only a single
branch or nop instruction at any time.

-Takahiro AKASHI

^ permalink raw reply	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2013-11-28 22:39 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-10-18 15:19 [PATCH v5 0/7] Optimize jump label implementation for ARM64 Jiang Liu
2013-10-18 15:19 ` Jiang Liu
2013-10-18 15:19 ` [PATCH v5 1/7] arm64: introduce basic aarch64 instruction decoding helpers Jiang Liu
2013-10-18 15:19   ` Jiang Liu
2013-10-31 17:39   ` Catalin Marinas
2013-10-31 17:39     ` Catalin Marinas
2013-11-06 15:10     ` Jiang Liu
2013-11-06 15:10       ` Jiang Liu
2013-10-18 15:19 ` [PATCH v5 2/7] arm64: introduce interfaces to hotpatch kernel and module code Jiang Liu
2013-10-18 15:19   ` Jiang Liu
2013-10-30  0:12   ` Will Deacon
2013-10-30  0:12     ` Will Deacon
2013-11-03 15:55     ` Jiang Liu
2013-11-03 15:55       ` Jiang Liu
2013-11-04 15:12       ` Sandeepa Prabhu
2013-11-04 15:12         ` Sandeepa Prabhu
2013-11-06 10:41         ` Will Deacon
2013-11-06 10:41           ` Will Deacon
2013-11-06 16:12           ` Jiang Liu
2013-11-06 16:12             ` Jiang Liu
2013-11-06 18:09             ` Will Deacon
2013-11-06 18:09               ` Will Deacon
2013-11-27 12:20       ` Will Deacon
2013-11-27 12:20         ` Will Deacon
2013-11-27 15:40         ` Jiang Liu
2013-11-27 15:40           ` Jiang Liu
2013-11-28 22:39       ` AKASHI Takahiro
2013-11-28 22:39         ` AKASHI Takahiro
2013-10-18 15:19 ` [PATCH v5 3/7] arm64: move encode_insn_immediate() from module.c to insn.c Jiang Liu
2013-10-18 15:19   ` Jiang Liu
2013-10-18 15:19 ` [PATCH v5 4/7] arm64: introduce aarch64_insn_gen_{nop|branch_imm}() helper functions Jiang Liu
2013-10-18 15:19   ` Jiang Liu
2013-10-30  0:48   ` Will Deacon
2013-10-30  0:48     ` Will Deacon
2013-11-06 16:31     ` Jiang Liu
2013-11-06 16:31       ` Jiang Liu
2013-11-06 16:45     ` Steven Rostedt
2013-11-06 16:45       ` Steven Rostedt
2013-11-06 18:02       ` Will Deacon
2013-11-06 18:02         ` Will Deacon
2013-10-18 15:19 ` [PATCH v5 5/7] arm64, jump label: detect %c support for ARM64 Jiang Liu
2013-10-18 15:19   ` Jiang Liu
2013-10-30  0:49   ` Will Deacon
2013-10-30  0:49     ` Will Deacon
2013-11-06 16:32     ` Jiang Liu
2013-11-06 16:32       ` Jiang Liu
2013-10-18 15:20 ` [PATCH v5 6/7] arm64, jump label: optimize jump label implementation Jiang Liu
2013-10-18 15:20   ` Jiang Liu
2013-10-30  0:55   ` Will Deacon
2013-10-30  0:55     ` Will Deacon
2013-11-06 16:38     ` Jiang Liu
2013-11-06 16:38       ` Jiang Liu
2013-10-18 15:20 ` [PATCH v5 7/7] jump_label: use defined macros instead of hard-coding for better readability Jiang Liu
2013-10-18 15:20   ` Jiang Liu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.