From: Wang ShaoBo <bobo.shaobowang@huawei.com>
To: unlisted-recipients:; (no To-header on input)
Cc: <cj.chengjian@huawei.com>, <huawei.libin@huawei.com>,
	<xiexiuqi@huawei.com>, <liwei391@huawei.com>,
	<linux-kernel@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>, <catalin.marinas@arm.com>,
	<will@kernel.org>, <rostedt@goodmis.org>, <mark.rutland@arm.com>,
	<bobo.shaobowang@huawei.com>, <zengshun.wu@outlook.com>
Subject: [RFC PATCH -next v2 4/4] arm64/ftrace: implement long jump for dynamic trampolines
Date: Wed, 16 Mar 2022 18:01:32 +0800	[thread overview]
Message-ID: <20220316100132.244849-5-bobo.shaobowang@huawei.com> (raw)
In-Reply-To: <20220316100132.244849-1-bobo.shaobowang@huawei.com>

From: Cheng Jian <cj.chengjian@huawei.com>

When KASLR is enabled, the trampoline and its callback may end up outside
the +/-128 MiB range of a bl instruction, so a long jump has to be
implemented both for jumping to the corresponding trampoline and for
jumping from that trampoline to the callback.

First, we use -fpatchable-function-entry=5,3 to reserve 3 NOPs at
the end of each function, in addition to the 2 NOPs at the head that
3b23e4991fb6 ("arm64: implement ftrace with regs") introduced.
Consequently rec->ip must skip a further sizeof(struct plt_entry)
(see ftrace_call_adjust()), provided no extra instruction (such as
BTI) is placed at the head.
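
For illustration (not part of this patch): with this option the per-function
layout the compiler emits looks roughly like the sketch below, assuming no
BTI landing pad; funcB stands for any traced function:

          | nop        //
          | nop        // 3 NOPs reserved at the tail of the previous
          | nop        //   function, reused as funcB's long-jump PLT entry
          funcB:
          | nop        // later patched to 'mov x9 x30' (save LR)
          | nop        // later patched to 'bl <>' (rec->ip points here)
          | ...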

Note that this will increase the size of the text section:
 Using defconfig with -fpatchable-function-entry=5,3:
   text        data      bss     dec        hex     filename
  27095761  14578580   533268  42207609  2840979    vmlinux

 Using defconfig with -fpatchable-function-entry=2:
   text        data      bss     dec        hex     filename
  26394597  14575988   533268  41503853  2794c6d    vmlinux
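 (That is about 700 KB of extra text, a ~2.7% increase.)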

When registering a new ftrace function, the registered ftrace_ops is
embedded in its own dynamically allocated trampoline instead of being
handled through the ftrace_ops_list:

   .text section:
     funcA:         |    funcA():       funcB():    |
      `->  +-----+  |    |   ...         mov x9 x30 |
           | ... |  |    |   adrp   <-   bl  <>     |
           | nop |  |    |   add
           | nop |  |    |   br   x16 ---+
    funcB: | nop |  |                    | ftrace_(regs_)caller_tramp:
      `->  +-----+  |                    `--> +---------------------+
           | nop |  |                         | ...                 |
           | nop |  |       ftrace callsite   +---------------------+
           | ... |  |                `----->  | PLT entry:          |
           | nop |  |                         |      adrp           |
           | nop |  |                         |      add            |
    funcC: | nop |  | ftrace graph callsite   |      br   x16       |
      `->  +-----+  |                `----->  +---------------------+
           | nop |  |                         | ...                 |
           | nop |  |                         +---------------------+

There are three steps to create this connection when registering
an ftrace_ops for a function to be traced (taking funcB as an example):

Step 1: use make_init_nop() to generate 'mov x9 x30' for LR reservation.
          funcB:
          | mov x9 x30
          | nop        //rec->ip
          | ...

Step 2: create the trampoline and use ftrace_make_call() to update the PLT entry:
          funcA:
          | ...
          | adrp
          | add
          | br   x16   //points to the allocated trampoline

Step 3: fill the PLT entry in the trampoline (whose ftrace_ops address is stored in x2):
          trampoline:
          | ...
          | adrp
          | add
          | br   x16   //points to the callback handler

Finally, atomically replace the nop in funcB with a bl to enable profiling:
          funcB:
          | mov x9 x30
          | bl <>      //branches to the first adrp at the end of funcA
          | ...
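
As a side note, the adrp/add pair materializes an arbitrary target address
in x16 before the br/blr, which is what makes the jump "long" (adrp itself
reaches +/-4 GiB around pc). Below is a self-contained userspace sketch of
that address arithmetic; it is not kernel code, and the two addresses in
it are made up:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint64_t pc  = 0xffff800010000000ULL; /* hypothetical PLT entry address */
        uint64_t dst = 0xffff800052345678ULL; /* hypothetical jump target */

        /* adrp x16, dst : pc-relative page address of dst */
        uint64_t x16 = (pc & ~0xfffULL) + ((dst & ~0xfffULL) - (pc & ~0xfffULL));
        /* add x16, x16, #(dst % SZ_4K) : add the low 12 bits */
        x16 += dst % 4096;
        /* br/blr x16 now lands exactly on dst */
        printf("x16 == dst ? %s\n", x16 == dst ? "yes" : "no");
        return 0;
    }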

If funcB is later bound to another ftrace_ops, Step 2 and Step 3 are
repeated. If funcB is bound to multiple ftrace_ops at once, this
trampoline is not used; the callsite jumps to ftrace_(regs_)caller
instead.
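
For clarity, the flavor selection done by update_ftrace_tramp_plt() can be
summarized by the following self-contained userspace sketch (the constants
and enum mirror this patch; the addresses in main() are made up):

    #include <stdio.h>

    #define SZ_128M (128 * 1024 * 1024L)

    enum { PLT_MAKE_NOP, PLT_MAKE_CALL, PLT_MAKE_CALL_LINK,
           PLT_MAKE_LCALL, PLT_MAKE_LCALL_LINK };

    /* mirrors the branch-range check in update_ftrace_tramp_plt() */
    static int plt_flavor(long pc, long addr, int enable, int link)
    {
        long offset = pc - addr;

        if (!enable)
            return PLT_MAKE_NOP;
        if (offset < -SZ_128M || offset >= SZ_128M) /* out of bl range */
            return link ? PLT_MAKE_LCALL_LINK : PLT_MAKE_LCALL;
        return link ? PLT_MAKE_CALL_LINK : PLT_MAKE_CALL;
    }

    int main(void)
    {
        /* near target: a short bl/b suffices */
        printf("%d\n", plt_flavor(0x10000000L, 0x10200000L, 1, 1)); /* 2 */
        /* far target (e.g. under KASLR): needs adrp/add + blr x16 */
        printf("%d\n", plt_flavor(0x10000000L, 0x90000000L, 1, 1)); /* 4 */
        return 0;
    }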

Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
---
 arch/arm64/Makefile              |   2 +-
 arch/arm64/include/asm/ftrace.h  |   2 +-
 arch/arm64/include/asm/module.h  |   9 +++
 arch/arm64/kernel/entry-ftrace.S |   8 ++
 arch/arm64/kernel/ftrace.c       | 127 +++++++++++++++++--------------
 arch/arm64/kernel/module-plts.c  |  50 ++++++++++++
 6 files changed, 140 insertions(+), 58 deletions(-)

diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index 2f1de88651e6..3139d6d9ee52 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -130,7 +130,7 @@ CHECKFLAGS	+= -D__aarch64__
 
 ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_REGS),y)
   KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
-  CC_FLAGS_FTRACE := -fpatchable-function-entry=2
+  CC_FLAGS_FTRACE := -fpatchable-function-entry=5,3
 endif
 
 # Default value
diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index 1494cfa8639b..d0562c13b1dc 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -70,7 +70,7 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
 	 * See ftrace_init_nop() for the callsite sequence.
 	 */
 	if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS))
-		return addr + AARCH64_INSN_SIZE;
+		return addr + AARCH64_INSN_SIZE + sizeof(struct plt_entry);
 	/*
 	 * addr is the address of the mcount call instruction.
 	 * recordmcount does the necessary offset calculation.
diff --git a/arch/arm64/include/asm/module.h b/arch/arm64/include/asm/module.h
index 4e7fa2623896..30f5b1cd5c9c 100644
--- a/arch/arm64/include/asm/module.h
+++ b/arch/arm64/include/asm/module.h
@@ -36,6 +36,14 @@ extern u64 module_alloc_base;
 #define module_alloc_base	((u64)_etext - MODULES_VSIZE)
 #endif
 
+enum {
+	PLT_MAKE_NOP,
+	PLT_MAKE_CALL,
+	PLT_MAKE_CALL_LINK,
+	PLT_MAKE_LCALL,
+	PLT_MAKE_LCALL_LINK
+};
+
 struct plt_entry {
 	/*
 	 * A program that conforms to the AArch64 Procedure Call Standard
@@ -58,6 +66,7 @@ static inline bool is_forbidden_offset_for_adrp(void *place)
 }
 
 struct plt_entry get_plt_entry(u64 dst, void *pc);
+struct plt_entry update_plt_entry(u64 pc, u64 dst, int record);
 bool plt_entries_equal(const struct plt_entry *a, const struct plt_entry *b);
 
 static inline bool plt_entry_is_initialized(const struct plt_entry *e)
diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S
index 88462d925446..71290f0e0740 100644
--- a/arch/arm64/kernel/entry-ftrace.S
+++ b/arch/arm64/kernel/entry-ftrace.S
@@ -154,8 +154,12 @@ SYM_INNER_LABEL(ftrace_regs_caller_tramp_ops, SYM_L_GLOBAL)
 	mov	x3, sp				// regs
 SYM_INNER_LABEL(ftrace_regs_caller_tramp_call, SYM_L_GLOBAL)
 	nop
+	nop
+	nop
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 SYM_INNER_LABEL(ftrace_regs_caller_tramp_graph_call, SYM_L_GLOBAL)
+	nop
+	nop
 	nop					// ftrace_graph_caller()
 #endif
 	ftrace_regs_restore
@@ -175,8 +179,12 @@ SYM_INNER_LABEL(ftrace_caller_tramp_ops, SYM_L_GLOBAL)
 	mov	x3, sp				// regs
 SYM_INNER_LABEL(ftrace_caller_tramp_call, SYM_L_GLOBAL)
 	nop
+	nop
+	nop
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 SYM_INNER_LABEL(ftrace_caller_tramp_graph_call, SYM_L_GLOBAL)
+	nop
+	nop
 	nop					// ftrace_graph_caller()
 #endif
 	ftrace_regs_restore
diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
index d35a042baf75..4011aceefe8c 100644
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -234,27 +234,55 @@ ftrace_trampoline_get_func(struct ftrace_ops *ops, bool is_graph)
 	return func;
 }
 
+static struct plt_entry *
+update_ftrace_tramp_plt(unsigned long pc, unsigned long addr, bool enable, bool link)
+{
+	int i;
+	struct plt_entry entry;
+	u32 *pentry;
+	long offset;
+	struct plt_entry *plt = (struct plt_entry *)pc;
+
+	offset = (long)pc - (long)addr;
+
+	if (!enable)
+		entry = update_plt_entry(pc, addr, PLT_MAKE_NOP);
+	else if (offset < -SZ_128M || offset >= SZ_128M) {
+		if (link)
+			entry = update_plt_entry(pc, addr, PLT_MAKE_LCALL_LINK);
+		else
+			entry = update_plt_entry(pc, addr, PLT_MAKE_LCALL);
+	} else {
+		if (link)
+			entry = update_plt_entry(pc, addr, PLT_MAKE_CALL_LINK);
+		else
+			entry = update_plt_entry(pc, addr, PLT_MAKE_CALL);
+	}
+
+	pentry = (u32 *)&entry;
+	for (i = 0; i < sizeof(struct plt_entry) / AARCH64_INSN_SIZE; i++)
+		aarch64_insn_patch_text_nosync((void *)((u32*)pc + i), *(pentry + i));
+
+	return plt;
+}
+
 static int
 ftrace_trampoline_modify_call(struct ftrace_ops *ops, bool is_graph, bool active)
 {
 	unsigned long offset;
 	unsigned long pc;
 	ftrace_func_t func;
-	u32 nop, branch;
 
 	offset = calc_trampoline_call_offset(ops->flags &
 				FTRACE_OPS_FL_SAVE_REGS, is_graph);
 	pc = ops->trampoline + offset;
 
 	func = ftrace_trampoline_get_func(ops, is_graph);
-	nop = aarch64_insn_gen_nop();
-	branch = aarch64_insn_gen_branch_imm(pc, (unsigned long)func,
-				AARCH64_INSN_BRANCH_LINK);
 
-	if (active)
-		return ftrace_modify_code(pc, 0, branch, false);
-	else
-		return ftrace_modify_code(pc, 0, nop, false);
+	if (update_ftrace_tramp_plt(pc, (unsigned long)func, active, true))
+		return 0;
+
+	return -EINVAL;
 }
 
 extern int ftrace_graph_active;
@@ -262,13 +290,6 @@ void arch_ftrace_update_trampoline(struct ftrace_ops *ops)
 {
 	unsigned int size;
 
-	/*
-	 * If kaslr is enabled, the address of tramp and ftrace call
-	 * may be far away, Therefore, long jump support is required.
-	 */
-	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
-		return;
-
 	if (!ops->trampoline) {
 		ops->trampoline = create_trampoline(ops, &size);
 		if (!ops->trampoline)
@@ -349,17 +370,34 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
 	return ftrace_modify_code(pc, 0, new, false);
 }
 
-static struct plt_entry *get_ftrace_plt(struct module *mod, unsigned long addr)
+extern void ftrace_kernel_plt(void);
+static struct plt_entry *
+get_ftrace_plt(struct module *mod, unsigned long pc, unsigned long addr)
 {
-#ifdef CONFIG_ARM64_MODULE_PLTS
-	struct plt_entry *plt = mod->arch.ftrace_trampolines;
+	struct plt_entry *plt;
 
-	if (addr == FTRACE_ADDR)
-		return &plt[FTRACE_PLT_IDX];
-	if (addr == FTRACE_REGS_ADDR &&
-	    IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS))
-		return &plt[FTRACE_REGS_PLT_IDX];
+	if (mod) {
+#ifdef CONFIG_ARM64_MODULE_PLTS
+		plt = mod->arch.ftrace_trampolines;
+
+		if (addr == FTRACE_ADDR)
+			return &plt[FTRACE_PLT_IDX];
+		else if (addr == FTRACE_REGS_ADDR &&
+			IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS))
+			return &plt[FTRACE_REGS_PLT_IDX];
+		else {
 #endif
+			pc = pc - sizeof(struct plt_entry) - AARCH64_INSN_SIZE;
+			return update_ftrace_tramp_plt(pc, addr, true, false);
+#ifdef CONFIG_ARM64_MODULE_PLTS
+		}
+#endif
+	} else {
+		WARN_ON(addr == FTRACE_ADDR || addr == FTRACE_REGS_ADDR);
+		pc = pc - sizeof(struct plt_entry) - AARCH64_INSN_SIZE;
+		return update_ftrace_tramp_plt(pc, addr, true, false);
+	}
+
 	return NULL;
 }
 
@@ -376,9 +414,6 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
 		struct module *mod;
 		struct plt_entry *plt;
 
-		if (!IS_ENABLED(CONFIG_ARM64_MODULE_PLTS))
-			return -EINVAL;
-
 		/*
 		 * On kernels that support module PLTs, the offset between the
 		 * branch instruction and its target may legally exceed the
@@ -393,12 +428,10 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
 		mod = __module_text_address(pc);
 		preempt_enable();
 
-		if (WARN_ON(!mod))
-			return -EINVAL;
-
-		plt = get_ftrace_plt(mod, addr);
+		plt = get_ftrace_plt(mod, pc, addr);
 		if (!plt) {
-			pr_err("ftrace: no module PLT for %ps\n", (void *)addr);
+			pr_err("ftrace: no %s PLT for %ps\n",
+					mod ? "module" : "kernel", (void *)addr);
 			return -EINVAL;
 		}
 
@@ -474,36 +507,18 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
 	if (offset < -SZ_128M || offset >= SZ_128M) {
 		u32 replaced;
 
-		if (!IS_ENABLED(CONFIG_ARM64_MODULE_PLTS))
-			return -EINVAL;
-
-		/*
-		 * 'mod' is only set at module load time, but if we end up
-		 * dealing with an out-of-range condition, we can assume it
-		 * is due to a module being loaded far away from the kernel.
-		 */
-		if (!mod) {
-			preempt_disable();
-			mod = __module_text_address(pc);
-			preempt_enable();
-
-			if (WARN_ON(!mod))
-				return -EINVAL;
-		}
-
 		/*
-		 * The instruction we are about to patch may be a branch and
-		 * link instruction that was redirected via a PLT entry. In
-		 * this case, the normal validation will fail, but we can at
-		 * least check that we are dealing with a branch and link
-		 * instruction that points into the right module.
+		 * The instruction we are about to patch may be a branch
+		 * and link instruction that was redirected via a PLT entry
+		 * supported by a dynamic trampoline or module, in this case,
+		 * the normal validation will fail, but we can at least check
+		 * that we are dealing with a branch and link instruction
+		 * that points into the right place.
 		 */
 		if (aarch64_insn_read((void *)pc, &replaced))
 			return -EFAULT;
 
-		if (!aarch64_insn_is_bl(replaced) ||
-		    !within_module(pc + aarch64_get_branch_offset(replaced),
-				   mod))
+		if (!aarch64_insn_is_bl(replaced))
 			return -EINVAL;
 
 		validate = false;
diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c
index e53493d8b208..b7a832dfaa69 100644
--- a/arch/arm64/kernel/module-plts.c
+++ b/arch/arm64/kernel/module-plts.c
@@ -37,6 +37,56 @@ struct plt_entry get_plt_entry(u64 dst, void *pc)
 	return plt;
 }
 
+struct plt_entry update_plt_entry(u64 pc, u64 dst, int record)
+{
+	u32 adrp, add, br;
+
+	switch (record) {
+	case PLT_MAKE_NOP:
+		adrp = aarch64_insn_gen_nop();
+		add = aarch64_insn_gen_nop();
+		br = aarch64_insn_gen_nop();
+		break;
+
+	case PLT_MAKE_CALL_LINK:
+		adrp = aarch64_insn_gen_nop();
+		add = aarch64_insn_gen_nop();
+		br = aarch64_insn_gen_branch_imm(pc + 8, dst,
+						 AARCH64_INSN_BRANCH_LINK);
+		break;
+
+	case PLT_MAKE_LCALL_LINK:
+		adrp = aarch64_insn_gen_adr(pc, dst, AARCH64_INSN_REG_16,
+					    AARCH64_INSN_ADR_TYPE_ADRP);
+		add = aarch64_insn_gen_add_sub_imm(AARCH64_INSN_REG_16,
+						   AARCH64_INSN_REG_16, dst % SZ_4K,
+						   AARCH64_INSN_VARIANT_64BIT,
+						   AARCH64_INSN_ADSB_ADD);
+		br = aarch64_insn_gen_branch_reg(AARCH64_INSN_REG_16,
+						 AARCH64_INSN_BRANCH_LINK);
+		break;
+
+	case PLT_MAKE_LCALL:
+		adrp = aarch64_insn_gen_adr(pc, dst, AARCH64_INSN_REG_16,
+					    AARCH64_INSN_ADR_TYPE_ADRP);
+		add = aarch64_insn_gen_add_sub_imm(AARCH64_INSN_REG_16,
+						   AARCH64_INSN_REG_16,
+						   dst % SZ_4K,
+						   AARCH64_INSN_VARIANT_64BIT,
+						   AARCH64_INSN_ADSB_ADD);
+		br = aarch64_insn_gen_branch_reg(AARCH64_INSN_REG_16,
+						 AARCH64_INSN_BRANCH_NOLINK);
+		break;
+	default:
+		pr_err("[%s %d] error flag %d\n", __func__, __LINE__, record);
+		BUG();
+		break;
+	}
+
+	return (struct plt_entry){ cpu_to_le32(adrp), cpu_to_le32(add),
+				cpu_to_le32(br) };
+}
+
 bool plt_entries_equal(const struct plt_entry *a, const struct plt_entry *b)
 {
 	u64 p, q;
-- 
2.25.1

