linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/4] arm64 live patching
@ 2018-10-01 14:09 Torsten Duwe
  2018-10-01 14:16 ` [PATCH v3 1/4] DYNAMIC_FTRACE configurable with and without REGS Torsten Duwe
                   ` (3 more replies)
  0 siblings, 4 replies; 48+ messages in thread
From: Torsten Duwe @ 2018-10-01 14:09 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Julien Thierry, Steven Rostedt,
	Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel, Arnd Bergmann,
	AKASHI Takahiro
  Cc: linux-arm-kernel, linux-kernel, live-patching

Hi all!

Some substantial changes were requested, so I had to shuffle a few
things around. All the bigger changes are in now.

[Changes from v2]:

* ifeq($(CONFIG_DYNAMIC_FTRACE_WITH_REGS),y) instead of ifdef

* "fix" commit 06aeaaeabf69da4. (new patch 1)
  Made DYNAMIC_FTRACE_WITH_REGS a real choice. The current situation
  would be that a linux-4.20 kernel on arm64 should be built with
  gcc >= 8; as in this case, as well as all other archs, the "default y"
  works. Only kernels >= 4.20, arm64, gcc < 8, must change this to "n"
  in order to not be stopped by the Makefile $(error) from patch 2/4.
  You'll then fall back to the DYNAMIC_FTRACE, if selected, like before.

* use some S_X* constants to refer to offsets into pt_regs in assembly.

* have the compiler/assembler generate the mov x9,x30 instruction that
  saves LR at compile time, rather than generate it repeatedly at runtime.

* flip the ftrace_regs_caller stack frame so that it is no longer
  upside down, as Ard remarked. This change broke the graph caller somehow.

* extend handling of the module arch-dependent ftrace trampoline with
  a companion "regs" version.
  
* clear the _TIF_PATCH_PENDING on do_notify_resume()

* took care of arch/arm64/kernel/time.c when changing stack unwinder
  semantics

[TODO]

* use more S_X* constants

* run the full livepatch test suite, especially test apply_relocate_add()
  functionality late after module load.

[Changes from v1]:

* Missing compiler support is now a Makefile error, instead
  of a warning. This will keep the compile log shorter and
  it will thus be easier to spot the problem.

* A separate ftrace_regs_caller. Only that one will write out
  a complete pt_regs, for efficiency.

* Replace the use of X19 with X28 to remember the old PC during
  live patch detection, as only that is saved&restored now for
  non-regs ftrace.

* CONFIG_DYNAMIC_FTRACE_WITH_REGS and CC_USING_PATCHABLE_FUNCTION_ENTRY
  are currently synonymous on arm64, but differentiate better for
  the future when this is no longer the case.

* Clean up "old"/"new" insn value setting vs. #ifdefs.

* #define a INSN_MOV_X9_X30 with suggested aarch64_insn_gen call
  and use that instead of an immediate hex value.

	Torsten

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH v3 1/4] DYNAMIC_FTRACE configurable with and without REGS
  2018-10-01 14:09 [PATCH v3 0/4] arm64 live patching Torsten Duwe
@ 2018-10-01 14:16 ` Torsten Duwe
  2018-10-01 14:52   ` Ard Biesheuvel
  2018-10-01 14:16 ` [PATCH v3 2/4] arm64: implement ftrace with regs Torsten Duwe
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 48+ messages in thread
From: Torsten Duwe @ 2018-10-01 14:16 UTC (permalink / raw)
  To: Masami Hiramatsu, Will Deacon, Catalin Marinas, Julien Thierry,
	Steven Rostedt, Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel,
	Arnd Bergmann, AKASHI Takahiro
  Cc: linux-arm-kernel, linux-kernel, live-patching

In commit 06aeaaeabf69da4, many ftrace-related config options are
consolidated. By accident, I guess, the choice about DYNAMIC_FTRACE
and DYNAMIC_FTRACE_WITH_REGS is no longer available explicitly but
determined by the sole availability on the architecture.

This makes it hard to introduce DYNAMIC_FTRACE_WITH_REGS if it depends
on new compiler features or other new properties of the toolchain
without breaking existing configurations.

This patch turns the def_bool into an actual choice. Should the toolchain
not meet the requirements for _WITH_REGS it can be turned off.

Signed-off-by: Torsten Duwe <duwe@suse.de>


--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -508,9 +508,15 @@ config DYNAMIC_FTRACE
 	  otherwise has native performance as long as no tracing is active.
 
 config DYNAMIC_FTRACE_WITH_REGS
-	def_bool y
+	bool "Include register content tracking in dynamic ftrace facility"
+	default y
 	depends on DYNAMIC_FTRACE
 	depends on HAVE_DYNAMIC_FTRACE_WITH_REGS
+	help
+	  This architecture supports the inspection of register contents,
+	  as passed between functions, at the dynamic ftrace points.
+	  This is also a prerequisite for Kernel Live Patching (KLP).
+	  When in doubt, say Y.
 
 config FUNCTION_PROFILER
 	bool "Kernel function profiler"

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH v3 2/4] arm64: implement ftrace with regs
  2018-10-01 14:09 [PATCH v3 0/4] arm64 live patching Torsten Duwe
  2018-10-01 14:16 ` [PATCH v3 1/4] DYNAMIC_FTRACE configurable with and without REGS Torsten Duwe
@ 2018-10-01 14:16 ` Torsten Duwe
  2018-10-01 15:57   ` Ard Biesheuvel
  2018-10-02 11:27   ` Mark Rutland
  2018-10-01 14:16 ` [PATCH v3 3/4] arm64: implement live patching Torsten Duwe
  2018-10-01 14:16 ` [PATCH v3 4/4] arm64: reliable stacktraces Torsten Duwe
  3 siblings, 2 replies; 48+ messages in thread
From: Torsten Duwe @ 2018-10-01 14:16 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Julien Thierry, Steven Rostedt,
	Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel, Arnd Bergmann,
	AKASHI Takahiro
  Cc: linux-arm-kernel, linux-kernel, live-patching

Check for compiler support of -fpatchable-function-entry and use it
to intercept functions immediately on entry, saving the LR in x9.
patchable-function-entry in GCC disables IPA-RA, which means ABI
register calling conventions are obeyed *and* scratch registers are
available.
Disable ftracing in efi/libstub, because this triggers cross-section
linker errors now (-pg is disabled already for those files).
Add an ftrace_caller which can handle LR in x9, as well as an
ftrace_regs_caller that additionally writes out a set of pt_regs
for inspection.
Introduce and handle an ftrace_regs_trampoline for module PLTs.

Signed-off-by: Torsten Duwe <duwe@suse.de>

--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -110,6 +110,7 @@ config ARM64
 	select HAVE_DEBUG_KMEMLEAK
 	select HAVE_DMA_CONTIGUOUS
 	select HAVE_DYNAMIC_FTRACE
+	select HAVE_DYNAMIC_FTRACE_WITH_REGS
 	select HAVE_EFFICIENT_UNALIGNED_ACCESS
 	select HAVE_FTRACE_MCOUNT_RECORD
 	select HAVE_FUNCTION_TRACER
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -78,6 +78,15 @@ ifeq ($(CONFIG_ARM64_MODULE_PLTS),y)
 KBUILD_LDFLAGS_MODULE	+= -T $(srctree)/arch/arm64/kernel/module.lds
 endif
 
+ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_REGS),y)
+  CC_FLAGS_FTRACE := -fpatchable-function-entry=2
+  KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
+  ifeq ($(call cc-option,-fpatchable-function-entry=2),)
+    $(error Cannot use CONFIG_DYNAMIC_FTRACE_WITH_REGS: \
+             -fpatchable-function-entry not supported by compiler)
+  endif
+endif
+
 # Default value
 head-y		:= arch/arm64/kernel/head.o
 
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -16,6 +16,17 @@
 #define MCOUNT_ADDR		((unsigned long)_mcount)
 #define MCOUNT_INSN_SIZE	AARCH64_INSN_SIZE
 
+/* DYNAMIC_FTRACE_WITH_REGS is implemented by adding 2 NOPs at the beginning
+   of each function, with the second NOP actually calling ftrace. In contrary
+   to a classic _mcount call, the call instruction to be modified is thus
+   the second one, and not the only one. */
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
+#define ARCH_SUPPORTS_FTRACE_OPS 1
+#define REC_IP_BRANCH_OFFSET AARCH64_INSN_SIZE
+#else
+#define REC_IP_BRANCH_OFFSET 0
+#endif
+
 #ifndef __ASSEMBLY__
 #include <linux/compat.h>
 
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -7,9 +7,9 @@ CPPFLAGS_vmlinux.lds	:= -DTEXT_OFFSET=$(
 AFLAGS_head.o		:= -DTEXT_OFFSET=$(TEXT_OFFSET)
 CFLAGS_armv8_deprecated.o := -I$(src)
 
-CFLAGS_REMOVE_ftrace.o = -pg
-CFLAGS_REMOVE_insn.o = -pg
-CFLAGS_REMOVE_return_address.o = -pg
+CFLAGS_REMOVE_ftrace.o = -pg $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_insn.o = -pg $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_return_address.o = -pg $(CC_FLAGS_FTRACE)
 
 # Object file lists.
 arm64-obj-y		:= debug-monitors.o entry.o irq.o fpsimd.o		\
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -11,7 +11,8 @@ cflags-$(CONFIG_X86)		+= -m$(BITS) -D__K
 				   -fPIC -fno-strict-aliasing -mno-red-zone \
 				   -mno-mmx -mno-sse -fshort-wchar
 
-cflags-$(CONFIG_ARM64)		:= $(subst -pg,,$(KBUILD_CFLAGS)) -fpie
+cflags-$(CONFIG_ARM64)		:= $(filter-out -pg $(CC_FLAGS_FTRACE)\
+				  ,$(KBUILD_CFLAGS)) -fpie
 cflags-$(CONFIG_ARM)		:= $(subst -pg,,$(KBUILD_CFLAGS)) \
 				   -fno-builtin -fpic -mno-single-pic-base
 
--- a/arch/arm64/kernel/entry-ftrace.S
+++ b/arch/arm64/kernel/entry-ftrace.S
@@ -13,6 +13,8 @@
 #include <asm/assembler.h>
 #include <asm/ftrace.h>
 #include <asm/insn.h>
+#include <asm/asm-offsets.h>
+#include <asm/assembler.h>
 
 /*
  * Gcc with -pg will put the following code in the beginning of each function:
@@ -123,6 +125,7 @@ skip_ftrace_call:			// }
 ENDPROC(_mcount)
 
 #else /* CONFIG_DYNAMIC_FTRACE */
+#ifndef CC_USING_PATCHABLE_FUNCTION_ENTRY
 /*
  * _mcount() is used to build the kernel with -pg option, but all the branch
  * instructions to _mcount() are replaced to NOP initially at kernel start up,
@@ -162,6 +165,92 @@ ftrace_graph_call:			// ftrace_graph_cal
 
 	mcount_exit
 ENDPROC(ftrace_caller)
+#else /* CC_USING_PATCHABLE_FUNCTION_ENTRY */
+
+/* Since no -pg or similar compiler flag is used, there should really be
+   no reference to _mcount; so do not define one. Only a function address
+   is needed in order to refer to it. */
+ENTRY(_mcount)
+	ret	/* just in case, prevent any fall through. */
+ENDPROC(_mcount)
+
+ENTRY(ftrace_regs_caller)
+	sub	sp, sp, #S_FRAME_SIZE
+	stp	x29, x9, [sp, #-16]	/* FP/LR link */
+
+	stp	x10, x11, [sp, #S_X10]
+	stp	x12, x13, [sp, #S_X12]
+	stp	x14, x15, [sp, #112]
+	stp	x16, x17, [sp, #128]
+	stp	x18, x19, [sp, #144]
+	stp	x20, x21, [sp, #160]
+	stp	x22, x23, [sp, #176]
+	stp	x24, x25, [sp, #192]
+	stp	x26, x27, [sp, #208]
+
+	b	ftrace_common
+ENDPROC(ftrace_regs_caller)
+
+ENTRY(ftrace_caller)
+	sub	sp, sp, #S_FRAME_SIZE
+	stp	x29, x9, [sp, #-16]	/* FP/LR link */
+
+ftrace_common:
+	stp	x28, x29, [sp, #224]	/* FP in pt_regs + "our" x28 */
+
+	/* save function arguments */
+	stp	x0, x1, [sp]
+	stp	x2, x3, [sp, #16]
+	stp	x4, x5, [sp, #32]
+	stp	x6, x7, [sp, #48]
+	stp	x8, x9, [sp, #64]
+
+	/* The link Register at callee entry */
+	str	x9, [sp, #S_LR]		/* to pt_regs.r[30] */
+	/* The program counter just after the ftrace call site */
+	str	lr, [sp, #S_PC]
+	/* The stack pointer as it was on ftrace_caller entry... */
+	add	x29, sp, #S_FRAME_SIZE
+	str	x29, [sp, #S_SP]
+
+	ldr_l	x2, function_trace_op, x0
+	mov	x1, x9		/* saved LR == parent IP */
+	sub	x0, lr, #8	/* function entry == IP */
+	mov	x3, sp		/* pt_regs are @sp */
+	sub	sp, sp, #16	/* skip over FP/LR link */
+
+	.global ftrace_call
+ftrace_call:
+
+	bl	ftrace_stub
+
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+	.global ftrace_graph_call
+ftrace_graph_call:			// ftrace_graph_caller();
+	nop				// If enabled, this will be replaced
+					// "b ftrace_graph_caller"
+#endif
+
+ftrace_common_return:
+	add	sp, sp, #16	/* advance to pt_regs for restore */
+
+	ldp	x0, x1, [sp]
+	ldp	x2, x3, [sp, #16]
+	ldp	x4, x5, [sp, #32]
+	ldp	x6, x7, [sp, #48]
+	ldp	x8, x9, [sp, #64]
+
+	ldp	x28, x29, [sp, #224]
+
+	ldr	x9, [sp, #S_PC]
+	ldr	lr, [sp, #S_LR]
+	add	sp, sp, #S_FRAME_SIZE
+
+	ret	x9
+
+ENDPROC(ftrace_caller)
+
+#endif /* CC_USING_PATCHABLE_FUNCTION_ENTRY */
 #endif /* CONFIG_DYNAMIC_FTRACE */
 
 ENTRY(ftrace_stub)
@@ -197,12 +286,20 @@ ENDPROC(ftrace_stub)
  * and run return_to_handler() later on its exit.
  */
 ENTRY(ftrace_graph_caller)
+#ifndef CC_USING_PATCHABLE_FUNCTION_ENTRY
 	mcount_get_lr_addr	  x0	//     pointer to function's saved lr
 	mcount_get_pc		  x1	//     function's pc
 	mcount_get_parent_fp	  x2	//     parent's fp
 	bl	prepare_ftrace_return	// prepare_ftrace_return(&lr, pc, fp)
 
 	mcount_exit
+#else
+	add	x0, sp, #S_LR+16	/* address of (LR pointing into caller) */
+	ldr	x1, [sp, #S_PC+16]
+	ldr	x2, [sp, #S_X28 + 8+16]	/* caller's frame pointer */
+	bl	prepare_ftrace_return
+	b	ftrace_common_return
+#endif
 ENDPROC(ftrace_graph_caller)
 
 /*
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -65,18 +65,66 @@ int ftrace_update_ftrace_func(ftrace_fun
 	return ftrace_modify_code(pc, 0, new, false);
 }
 
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
+/* Have the assembler generate a known "mov x9,x30" at compile time. */
+static void notrace noinline __attribute__((used)) mov_x9_x30(void)
+{
+	asm(" .global insn_mov_x9_x30\n"
+		     "insn_mov_x9_x30: mov x9,x30\n" : : : "x9");
+}
+#endif
+
+#ifdef CONFIG_ARM64_MODULE_PLTS
+int use_ftrace_trampoline(struct module *mod, unsigned long *addr)
+{
+	struct plt_entry trampoline;
+	trampoline = get_plt_entry(*addr);
+	if (*addr == FTRACE_ADDR) {
+		if (!plt_entries_equal(mod->arch.ftrace_trampoline,
+				       &trampoline)) {
+
+			/* point the trampoline to our ftrace entry point */
+			module_disable_ro(mod);
+			*mod->arch.ftrace_trampoline = trampoline;
+			module_enable_ro(mod, true);
+
+			/* update trampoline before patching in the branch */
+			smp_wmb();
+		}
+		*addr = (unsigned long)(void *)mod->arch.ftrace_trampoline;
+	}
+	else if (*addr == FTRACE_REGS_ADDR) {
+		if (!plt_entries_equal(mod->arch.ftrace_regs_trampoline,
+				       &trampoline)) {
+
+			/* point the trampoline to our ftrace entry point */
+			module_disable_ro(mod);
+			*mod->arch.ftrace_regs_trampoline = trampoline;
+			module_enable_ro(mod, true);
+
+			/* update trampoline before patching in the branch */
+			smp_wmb();
+		}
+		*addr = (unsigned long)(void *)mod->arch.ftrace_regs_trampoline;
+	}
+	else
+		return -EINVAL;
+	return 0;
+}
+#endif
+
 /*
  * Turn on the call to ftrace_caller() in instrumented function
  */
 int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
 {
-	unsigned long pc = rec->ip;
+	unsigned long pc = rec->ip + REC_IP_BRANCH_OFFSET;
+	int ret;
 	u32 old, new;
 	long offset = (long)pc - (long)addr;
 
 	if (offset < -SZ_128M || offset >= SZ_128M) {
 #ifdef CONFIG_ARM64_MODULE_PLTS
-		struct plt_entry trampoline;
 		struct module *mod;
 
 		/*
@@ -96,54 +144,67 @@ int ftrace_make_call(struct dyn_ftrace *
 		if (WARN_ON(!mod))
 			return -EINVAL;
 
-		/*
-		 * There is only one ftrace trampoline per module. For now,
-		 * this is not a problem since on arm64, all dynamic ftrace
-		 * invocations are routed via ftrace_caller(). This will need
-		 * to be revisited if support for multiple ftrace entry points
-		 * is added in the future, but for now, the pr_err() below
-		 * deals with a theoretical issue only.
-		 */
-		trampoline = get_plt_entry(addr);
-		if (!plt_entries_equal(mod->arch.ftrace_trampoline,
-				       &trampoline)) {
-			if (!plt_entries_equal(mod->arch.ftrace_trampoline,
-					       &(struct plt_entry){})) {
-				pr_err("ftrace: far branches to multiple entry points unsupported inside a single module\n");
-				return -EINVAL;
-			}
-
-			/* point the trampoline to our ftrace entry point */
-			module_disable_ro(mod);
-			*mod->arch.ftrace_trampoline = trampoline;
-			module_enable_ro(mod, true);
-
-			/* update trampoline before patching in the branch */
-			smp_wmb();
+		/* Check against our well-known list of ftrace entry points */
+		if (addr == FTRACE_ADDR || addr == FTRACE_REGS_ADDR) {
+			ret = use_ftrace_trampoline(mod, &addr);
+			if (ret < 0)
+				return ret;
 		}
-		addr = (unsigned long)(void *)mod->arch.ftrace_trampoline;
+		else
+			return -EINVAL;
+
 #else /* CONFIG_ARM64_MODULE_PLTS */
 		return -EINVAL;
 #endif /* CONFIG_ARM64_MODULE_PLTS */
 	}
 
 	old = aarch64_insn_gen_nop();
+	if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS)) {
+		new = *(u32*)&mov_x9_x30;
+		ret = ftrace_modify_code(pc - REC_IP_BRANCH_OFFSET,
+					 old, new, true);
+		if (ret)
+			return ret;
+		smp_wmb(); /* ensure LR saver is in place before ftrace call */
+	}
 	new = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
 
 	return ftrace_modify_code(pc, old, new, true);
 }
 
+int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
+			unsigned long addr)
+{
+	unsigned long pc = rec->ip + REC_IP_BRANCH_OFFSET;
+	u32 old, new;
+
+	old = aarch64_insn_gen_branch_imm(pc, old_addr, true);
+	new = aarch64_insn_gen_branch_imm(pc, addr, true);
+
+	return ftrace_modify_code(pc, old, new, true);
+}
+
+
 /*
  * Turn off the call to ftrace_caller() in instrumented function
  */
 int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
 		    unsigned long addr)
 {
-	unsigned long pc = rec->ip;
+	unsigned long pc = rec->ip + REC_IP_BRANCH_OFFSET;
 	bool validate = true;
+	int ret;
 	u32 old = 0, new;
 	long offset = (long)pc - (long)addr;
 
+#ifdef CC_USING_PATCHABLE_FUNCTION_ENTRY
+	/* -fpatchable-function-entry= does not generate a profiling call
+	 *  initially; the NOPs are already there.
+	 */
+	if (addr == MCOUNT_ADDR)
+		return 0;
+#endif
+
 	if (offset < -SZ_128M || offset >= SZ_128M) {
 #ifdef CONFIG_ARM64_MODULE_PLTS
 		u32 replaced;
@@ -188,7 +249,16 @@ int ftrace_make_nop(struct module *mod,
 
 	new = aarch64_insn_gen_nop();
 
-	return ftrace_modify_code(pc, old, new, validate);
+	ret = ftrace_modify_code(pc, old, new, validate);
+	if (ret)
+		return ret;
+	if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS)) {
+		smp_wmb(); /* ftrace call must not remain without LR saver. */
+		old = *(u32*)&mov_x9_x30;
+		ret = ftrace_modify_code(pc - REC_IP_BRANCH_OFFSET,
+					 old, new, true);
+	}
+	return ret;
 }
 
 void arch_ftrace_update_code(int command)
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -115,6 +115,7 @@
 #define MCOUNT_REC()	. = ALIGN(8);				\
 			__start_mcount_loc = .;			\
 			KEEP(*(__mcount_loc))			\
+			KEEP(*(__patchable_function_entries))	\
 			__stop_mcount_loc = .;
 #else
 #define MCOUNT_REC()
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -61,8 +61,12 @@ extern void __chk_io_ptr(const volatile
 #if defined(CC_USING_HOTPATCH) && !defined(__CHECKER__)
 #define notrace __attribute__((hotpatch(0,0)))
 #else
+#ifdef CC_USING_PATCHABLE_FUNCTION_ENTRY
+#define notrace __attribute__((patchable_function_entry(0)))
+#else
 #define notrace __attribute__((no_instrument_function))
 #endif
+#endif
 
 /* Intel compiler defines __GNUC__. So we will overwrite implementations
  * coming from above header files here
--- a/arch/arm64/include/asm/module.h
+++ b/arch/arm64/include/asm/module.h
@@ -33,6 +33,7 @@ struct mod_arch_specific {
 
 	/* for CONFIG_DYNAMIC_FTRACE */
 	struct plt_entry 	*ftrace_trampoline;
+	struct plt_entry 	*ftrace_regs_trampoline;
 };
 #endif
 
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -454,6 +454,10 @@ int module_finalize(const Elf_Ehdr *hdr,
 		if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
 		    !strcmp(".text.ftrace_trampoline", secstrs + s->sh_name))
 			me->arch.ftrace_trampoline = (void *)s->sh_addr;
+		if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS) &&
+		    !strcmp(".text.ftrace_regs_trampoline",
+				secstrs + s->sh_name))
+			me->arch.ftrace_regs_trampoline = (void *)s->sh_addr;
 #endif
 	}
 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH v3 3/4] arm64: implement live patching
  2018-10-01 14:09 [PATCH v3 0/4] arm64 live patching Torsten Duwe
  2018-10-01 14:16 ` [PATCH v3 1/4] DYNAMIC_FTRACE configurable with and without REGS Torsten Duwe
  2018-10-01 14:16 ` [PATCH v3 2/4] arm64: implement ftrace with regs Torsten Duwe
@ 2018-10-01 14:16 ` Torsten Duwe
  2018-10-17 13:39   ` Miroslav Benes
                     ` (2 more replies)
  2018-10-01 14:16 ` [PATCH v3 4/4] arm64: reliable stacktraces Torsten Duwe
  3 siblings, 3 replies; 48+ messages in thread
From: Torsten Duwe @ 2018-10-01 14:16 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Julien Thierry, Steven Rostedt,
	Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel, Arnd Bergmann,
	AKASHI Takahiro
  Cc: linux-arm-kernel, linux-kernel, live-patching

Based on ftrace with regs, do the usual thing. Also allocate a
task flag for whatever consistency handling will be used.
Watch out for interactions with the graph tracer.

Signed-off-by: Torsten Duwe <duwe@suse.de>

--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -119,6 +119,7 @@ config ARM64
 	select HAVE_GENERIC_DMA_COHERENT
 	select HAVE_HW_BREAKPOINT if PERF_EVENTS
 	select HAVE_IRQ_TIME_ACCOUNTING
+	select HAVE_LIVEPATCH
 	select HAVE_MEMBLOCK
 	select HAVE_MEMBLOCK_NODE_MAP if NUMA
 	select HAVE_NMI
@@ -1349,4 +1350,6 @@ if CRYPTO
 source "arch/arm64/crypto/Kconfig"
 endif
 
+source "kernel/livepatch/Kconfig"
+
 source "lib/Kconfig"
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -76,6 +76,7 @@ void arch_release_task_struct(struct tas
 #define TIF_FOREIGN_FPSTATE	3	/* CPU's FP state is not current's */
 #define TIF_UPROBE		4	/* uprobe breakpoint or singlestep */
 #define TIF_FSCHECK		5	/* Check FS is USER_DS on return */
+#define TIF_PATCH_PENDING	6
 #define TIF_NOHZ		7
 #define TIF_SYSCALL_TRACE	8
 #define TIF_SYSCALL_AUDIT	9
@@ -94,6 +95,7 @@ void arch_release_task_struct(struct tas
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
 #define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
 #define _TIF_FOREIGN_FPSTATE	(1 << TIF_FOREIGN_FPSTATE)
+#define _TIF_PATCH_PENDING	(1 << TIF_PATCH_PENDING)
 #define _TIF_NOHZ		(1 << TIF_NOHZ)
 #define _TIF_SYSCALL_TRACE	(1 << TIF_SYSCALL_TRACE)
 #define _TIF_SYSCALL_AUDIT	(1 << TIF_SYSCALL_AUDIT)
@@ -106,7 +108,8 @@ void arch_release_task_struct(struct tas
 
 #define _TIF_WORK_MASK		(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
 				 _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
-				 _TIF_UPROBE | _TIF_FSCHECK)
+				 _TIF_UPROBE | _TIF_FSCHECK | \
+				 _TIF_PATCH_PENDING)
 
 #define _TIF_SYSCALL_WORK	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
 				 _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
--- /dev/null
+++ b/arch/arm64/include/asm/livepatch.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * livepatch.h - arm64-specific Kernel Live Patching Core
+ *
+ * Copyright (C) 2016,2018 SUSE
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef _ASM_ARM64_LIVEPATCH_H
+#define _ASM_ARM64_LIVEPATCH_H
+
+#include <linux/module.h>
+#include <linux/ftrace.h>
+
+static inline int klp_check_compiler_support(void)
+{
+	return 0;
+}
+
+static inline void klp_arch_set_pc(struct pt_regs *regs, unsigned long ip)
+{
+	regs->pc = ip;
+}
+
+#endif /* _ASM_ARM64_LIVEPATCH_H */
--- a/arch/arm64/kernel/entry-ftrace.S
+++ b/arch/arm64/kernel/entry-ftrace.S
@@ -209,6 +209,9 @@ ftrace_common:
 	str	x9, [sp, #S_LR]		/* to pt_regs.r[30] */
 	/* The program counter just after the ftrace call site */
 	str	lr, [sp, #S_PC]
+#if defined(CONFIG_LIVEPATCH) && defined(CONFIG_FUNCTION_GRAPH_TRACER)
+	mov	x28,lr		/* remember old return address */
+#endif
 	/* The stack pointer as it was on ftrace_caller entry... */
 	add	x29, sp, #S_FRAME_SIZE
 	str	x29, [sp, #S_SP]
@@ -224,6 +227,16 @@ ftrace_call:
 
 	bl	ftrace_stub
 
+#if defined(CONFIG_LIVEPATCH) && defined(CONFIG_FUNCTION_GRAPH_TRACER)
+	/* Is the trace function a live patcher an has messed with
+	 * the return address?
+	*/
+	ldr	x9, [sp, #S_PC+16]
+	cmp	x9, x28		/* compare with the value we remembered */
+	/* to not call graph tracer's "call" mechanism twice! */
+	b.ne	ftrace_common_return
+#endif
+
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	.global ftrace_graph_call
 ftrace_graph_call:			// ftrace_graph_caller();
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -29,6 +29,7 @@
 #include <linux/sizes.h>
 #include <linux/string.h>
 #include <linux/tracehook.h>
+#include <linux/livepatch.h>
 #include <linux/ratelimit.h>
 #include <linux/syscalls.h>
 
@@ -934,6 +935,9 @@ asmlinkage void do_notify_resume(struct
 			if (thread_flags & _TIF_UPROBE)
 				uprobe_notify_resume(regs);
 
+			if (thread_flags & _TIF_PATCH_PENDING)
+				klp_update_patch_state(current);
+
 			if (thread_flags & _TIF_SIGPENDING)
 				do_signal(regs);
 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH v3 4/4] arm64: reliable stacktraces
  2018-10-01 14:09 [PATCH v3 0/4] arm64 live patching Torsten Duwe
                   ` (2 preceding siblings ...)
  2018-10-01 14:16 ` [PATCH v3 3/4] arm64: implement live patching Torsten Duwe
@ 2018-10-01 14:16 ` Torsten Duwe
  3 siblings, 0 replies; 48+ messages in thread
From: Torsten Duwe @ 2018-10-01 14:16 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Julien Thierry, Steven Rostedt,
	Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel, Arnd Bergmann,
	AKASHI Takahiro
  Cc: linux-arm-kernel, linux-kernel, live-patching

Make unwind_frame() report whether it had to stop normally or due to
an error condition; walk_stackframe() will pass that info.
__save_stack_trace() is used to check the validity of a frame;
save_stack_trace_tsk_reliable() can now trivially be implemented.
Modify arch/arm64/kernel/time.c for the new semantics.

Signed-off-by: Torsten Duwe <duwe@suse.de>

--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -127,8 +127,9 @@ config ARM64
 	select HAVE_PERF_EVENTS
 	select HAVE_PERF_REGS
 	select HAVE_PERF_USER_STACK_DUMP
-	select HAVE_REGS_AND_STACK_ACCESS_API
 	select HAVE_RCU_TABLE_FREE
+	select HAVE_REGS_AND_STACK_ACCESS_API
+	select HAVE_RELIABLE_STACKTRACE
 	select HAVE_STACKPROTECTOR
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_KPROBES
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -33,7 +33,7 @@ struct stackframe {
 };
 
 extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame);
-extern void walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
+extern int walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
 			    int (*fn)(struct stackframe *, void *), void *data);
 extern void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk);
 
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -40,6 +40,13 @@
  *	ldp	x29, x30, [sp]
  *	add	sp, sp, #0x10
  */
+
+/*
+ * unwind_frame -- unwind a single stack frame.
+ * Returns 0 when there are more frames to go.
+ * 1 means reached end of stack; negative (error)
+ * means stopped because information is not reliable.
+ */
 int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
 {
 	unsigned long fp = frame->fp;
@@ -81,23 +88,27 @@ int notrace unwind_frame(struct task_str
 	 * both are NULL.
 	 */
 	if (!frame->fp && !frame->pc)
-		return -EINVAL;
+		return 1;
 
 	return 0;
 }
 
-void notrace walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
+int notrace walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
 		     int (*fn)(struct stackframe *, void *), void *data)
 {
 	while (1) {
 		int ret;
 
-		if (fn(frame, data))
-			break;
+		ret = fn(frame, data);
+		if (ret)
+			return ret;
 		ret = unwind_frame(tsk, frame);
 		if (ret < 0)
+			return ret;
+		if (ret > 0)
 			break;
 	}
+	return 0;
 }
 
 #ifdef CONFIG_STACKTRACE
@@ -145,14 +156,15 @@ void save_stack_trace_regs(struct pt_reg
 		trace->entries[trace->nr_entries++] = ULONG_MAX;
 }
 
-static noinline void __save_stack_trace(struct task_struct *tsk,
+static noinline int __save_stack_trace(struct task_struct *tsk,
 	struct stack_trace *trace, unsigned int nosched)
 {
 	struct stack_trace_data data;
 	struct stackframe frame;
+	int ret;
 
 	if (!try_get_task_stack(tsk))
-		return;
+		return -EBUSY;
 
 	data.trace = trace;
 	data.skip = trace->skip;
@@ -171,11 +183,12 @@ static noinline void __save_stack_trace(
 	frame.graph = tsk->curr_ret_stack;
 #endif
 
-	walk_stackframe(tsk, &frame, save_trace, &data);
+	ret = walk_stackframe(tsk, &frame, save_trace, &data);
 	if (trace->nr_entries < trace->max_entries)
 		trace->entries[trace->nr_entries++] = ULONG_MAX;
 
 	put_task_stack(tsk);
+	return ret;
 }
 EXPORT_SYMBOL_GPL(save_stack_trace_tsk);
 
@@ -190,4 +203,12 @@ void save_stack_trace(struct stack_trace
 }
 
 EXPORT_SYMBOL_GPL(save_stack_trace);
+
+int save_stack_trace_tsk_reliable(struct task_struct *tsk,
+				  struct stack_trace *trace)
+{
+	return __save_stack_trace(tsk, trace, 1);
+}
+EXPORT_SYMBOL_GPL(save_stack_trace_tsk_reliable);
+
 #endif
--- a/arch/arm64/kernel/time.c
+++ b/arch/arm64/kernel/time.c
@@ -56,7 +56,7 @@ unsigned long profile_pc(struct pt_regs
 #endif
 	do {
 		int ret = unwind_frame(NULL, &frame);
-		if (ret < 0)
+		if (ret)
 			return 0;
 	} while (in_lock_functions(frame.pc));
 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 1/4] DYNAMIC_FTRACE configurable with and without REGS
  2018-10-01 14:16 ` [PATCH v3 1/4] DYNAMIC_FTRACE configurable with and without REGS Torsten Duwe
@ 2018-10-01 14:52   ` Ard Biesheuvel
  2018-10-01 15:03     ` Torsten Duwe
  0 siblings, 1 reply; 48+ messages in thread
From: Ard Biesheuvel @ 2018-10-01 14:52 UTC (permalink / raw)
  To: Torsten Duwe
  Cc: Masami Hiramatsu, Will Deacon, Catalin Marinas, Julien Thierry,
	Steven Rostedt, Josh Poimboeuf, Ingo Molnar, Arnd Bergmann,
	AKASHI Takahiro, linux-arm-kernel, Linux Kernel Mailing List,
	live-patching

Hi Torsten,

On 1 October 2018 at 16:16, Torsten Duwe <duwe@lst.de> wrote:
> In commit 06aeaaeabf69da4, many ftrace-related config options are
> consolidated. By accident, I guess, the choice about DYNAMIC_FTRACE
> and DYNAMIC_FTRACE_WITH_REGS is no longer available explicitly but
> determined by the sole availability on the architecture.
>
> This makes it hard to introduce DYNAMIC_FTRACE_WITH_REGS if it depends
> on new compiler features or other new properties of the toolchain
> without breaking existing configurations.
>
> This patch turns the def_bool into an actual choice. Should the toolchain
> not meet the requirements for _WITH_REGS it can be turned off.
>

I guess we now have Kbuild/Kconfig support for this, no? I mean, we
can now show/hide options depending on the capabilities of the
toolchain.

I am not saying it would be a better approach, though - I'd rather
have a warning than have things silently disabled, but other people
may think differently.


> Signed-off-by: Torsten Duwe <duwe@suse.de>
>
>
> --- a/kernel/trace/Kconfig
> +++ b/kernel/trace/Kconfig
> @@ -508,9 +508,15 @@ config DYNAMIC_FTRACE
>           otherwise has native performance as long as no tracing is active.
>
>  config DYNAMIC_FTRACE_WITH_REGS
> -       def_bool y
> +       bool "Include register content tracking in dynamic ftrace facility"
> +       default y
>         depends on DYNAMIC_FTRACE
>         depends on HAVE_DYNAMIC_FTRACE_WITH_REGS
> +       help
> +         This architecture supports the inspection of register contents,
> +         as passed between functions, at the dynamic ftrace points.
> +         This is also a prerequisite for Kernel Live Patching (KLP).
> +         When in doubt, say Y.
>
>  config FUNCTION_PROFILER
>         bool "Kernel function profiler"

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 1/4] DYNAMIC_FTRACE configurable with and without REGS
  2018-10-01 14:52   ` Ard Biesheuvel
@ 2018-10-01 15:03     ` Torsten Duwe
  2018-10-01 15:06       ` Ard Biesheuvel
  0 siblings, 1 reply; 48+ messages in thread
From: Torsten Duwe @ 2018-10-01 15:03 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Masami Hiramatsu, Will Deacon, Catalin Marinas, Julien Thierry,
	Steven Rostedt, Josh Poimboeuf, Ingo Molnar, Arnd Bergmann,
	AKASHI Takahiro, linux-arm-kernel, Linux Kernel Mailing List,
	live-patching

On Mon, Oct 01, 2018 at 04:52:27PM +0200, Ard Biesheuvel wrote:
> 
> I guess we now have Kbuild/Kconfig support for this, no? I mean, we
> can now show/hide options depending on the capabilities of the
> toolchain.

Config options depending on flags availability?

> I am not saying it would be a better approach, though - I'd rather
> have a warning than have things silently disabled, but other people
> may think differently.

Even then this switch has to be enabled, no matter who or what sets it.
Note that this patch leaves everything as-is. Only arm64 users with
"old" compilers would need to take action.

	Torsten


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 1/4] DYNAMIC_FTRACE configurable with and without REGS
  2018-10-01 15:03     ` Torsten Duwe
@ 2018-10-01 15:06       ` Ard Biesheuvel
  2018-10-01 15:10         ` Torsten Duwe
  0 siblings, 1 reply; 48+ messages in thread
From: Ard Biesheuvel @ 2018-10-01 15:06 UTC (permalink / raw)
  To: Torsten Duwe
  Cc: Masami Hiramatsu, Will Deacon, Catalin Marinas, Julien Thierry,
	Steven Rostedt, Josh Poimboeuf, Ingo Molnar, Arnd Bergmann,
	AKASHI Takahiro, linux-arm-kernel, Linux Kernel Mailing List,
	live-patching

On 1 October 2018 at 17:03, Torsten Duwe <duwe@lst.de> wrote:
> On Mon, Oct 01, 2018 at 04:52:27PM +0200, Ard Biesheuvel wrote:
>>
>> I guess we now have Kbuild/Kconfig support for this, no? I mean, we
>> can now show/hide options depending on the capabilities of the
>> toolchain.
>
> Config options depending on flags availability?
>

Yes. Note that 'make menuconfig' now prints the compiler version at
the top, and kconfig options can now 'depend' on compiler features,
although I must admit I don't know how it works.

>> I am not saying it would be a better approach, though - I'd rather
>> have a warning than have things silently disabled, but other people
>> may think differently.
>
> Even then this switch has to be enabled, no matter who or what sets it.
> Note that this patch leaves everything as-is. Only arm64 users with
> "old" compilers would need to take action.
>
>         Torsten
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 1/4] DYNAMIC_FTRACE configurable with and without REGS
  2018-10-01 15:06       ` Ard Biesheuvel
@ 2018-10-01 15:10         ` Torsten Duwe
  2018-10-01 15:14           ` Steven Rostedt
  0 siblings, 1 reply; 48+ messages in thread
From: Torsten Duwe @ 2018-10-01 15:10 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Masami Hiramatsu, Will Deacon, Catalin Marinas, Julien Thierry,
	Steven Rostedt, Josh Poimboeuf, Ingo Molnar, Arnd Bergmann,
	AKASHI Takahiro, linux-arm-kernel, Linux Kernel Mailing List,
	live-patching

On Mon, Oct 01, 2018 at 05:06:06PM +0200, Ard Biesheuvel wrote:
> On 1 October 2018 at 17:03, Torsten Duwe <duwe@lst.de> wrote:
> > On Mon, Oct 01, 2018 at 04:52:27PM +0200, Ard Biesheuvel wrote:
> >>
> >> I guess we now have Kbuild/Kconfig support for this, no? I mean, we
> >> can now show/hide options depending on the capabilities of the
> >> toolchain.
> >
> > Config options depending on flags availability?
> >
> Yes. Note that 'make menuconfig' now prints the compiler version at
> the top, and kconfig options can now 'depend' on compiler features,

Ah, cool, got it. So unless anyone else thinks this patch is useful,
feel free to disregard it ;-) Point taken.

	Torsten


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 1/4] DYNAMIC_FTRACE configurable with and without REGS
  2018-10-01 15:10         ` Torsten Duwe
@ 2018-10-01 15:14           ` Steven Rostedt
  0 siblings, 0 replies; 48+ messages in thread
From: Steven Rostedt @ 2018-10-01 15:14 UTC (permalink / raw)
  To: Torsten Duwe
  Cc: Ard Biesheuvel, Masami Hiramatsu, Will Deacon, Catalin Marinas,
	Julien Thierry, Josh Poimboeuf, Ingo Molnar, Arnd Bergmann,
	AKASHI Takahiro, linux-arm-kernel, Linux Kernel Mailing List,
	live-patching

On Mon, 1 Oct 2018 17:10:37 +0200
Torsten Duwe <duwe@lst.de> wrote:

> On Mon, Oct 01, 2018 at 05:06:06PM +0200, Ard Biesheuvel wrote:
> > On 1 October 2018 at 17:03, Torsten Duwe <duwe@lst.de> wrote:  
> > > On Mon, Oct 01, 2018 at 04:52:27PM +0200, Ard Biesheuvel wrote:  
> > >>
> > >> I guess we now have Kbuild/Kconfig support for this, no? I mean, we
> > >> can now show/hide options depending on the capabilities of the
> > >> toolchain.  
> > >
> > > Config options depending on flags availability?
> > >  
> > Yes. Note that 'make menuconfig' now prints the compiler version at
> > the top, and kconfig options can now 'depend' on compiler features,  
> 
> Ah, cool, got it. So unless anyone else thinks this patch is useful,
> feel free to disregard it ;-) Point taken.
>

Yes, please don't apply this patch. Have the build system figure out if
the tool chain can handle it or not.

-- Steve

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 2/4] arm64: implement ftrace with regs
  2018-10-01 14:16 ` [PATCH v3 2/4] arm64: implement ftrace with regs Torsten Duwe
@ 2018-10-01 15:57   ` Ard Biesheuvel
  2018-10-02 10:02     ` Torsten Duwe
  2018-10-02 11:27   ` Mark Rutland
  1 sibling, 1 reply; 48+ messages in thread
From: Ard Biesheuvel @ 2018-10-01 15:57 UTC (permalink / raw)
  To: Torsten Duwe
  Cc: Will Deacon, Catalin Marinas, Julien Thierry, Steven Rostedt,
	Josh Poimboeuf, Ingo Molnar, Arnd Bergmann, AKASHI Takahiro,
	linux-arm-kernel, Linux Kernel Mailing List, live-patching

On 1 October 2018 at 16:16, Torsten Duwe <duwe@lst.de> wrote:
> Check for compiler support of -fpatchable-function-entry and use it
> to intercept functions immediately on entry, saving the LR in x9.
> patchable-function-entry in GCC disables IPA-RA, which means ABI
> register calling conventions are obeyed *and* scratch registers are
> available.
> Disable ftracing in efi/libstub, because this triggers cross-section
> linker errors now (-pg is disabled already for those files).
> Add an ftrace_caller which can handle LR in x9, as well as an
> ftrace_regs_caller that additionally writes out a set of pt_regs
> for inspection.
> Introduce and handle an ftrace_regs_trampoline for module PLTs.
>
> Signed-off-by: Torsten Duwe <duwe@suse.de>
>
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -110,6 +110,7 @@ config ARM64
>         select HAVE_DEBUG_KMEMLEAK
>         select HAVE_DMA_CONTIGUOUS
>         select HAVE_DYNAMIC_FTRACE
> +       select HAVE_DYNAMIC_FTRACE_WITH_REGS
>         select HAVE_EFFICIENT_UNALIGNED_ACCESS
>         select HAVE_FTRACE_MCOUNT_RECORD
>         select HAVE_FUNCTION_TRACER
> --- a/arch/arm64/Makefile
> +++ b/arch/arm64/Makefile
> @@ -78,6 +78,15 @@ ifeq ($(CONFIG_ARM64_MODULE_PLTS),y)
>  KBUILD_LDFLAGS_MODULE  += -T $(srctree)/arch/arm64/kernel/module.lds
>  endif
>
> +ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_REGS),y)
> +  CC_FLAGS_FTRACE := -fpatchable-function-entry=2
> +  KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
> +  ifeq ($(call cc-option,-fpatchable-function-entry=2),)
> +    $(error Cannot use CONFIG_DYNAMIC_FTRACE_WITH_REGS: \
> +             -fpatchable-function-entry not supported by compiler)
> +  endif
> +endif
> +
>  # Default value
>  head-y         := arch/arm64/kernel/head.o
>
> --- a/arch/arm64/include/asm/ftrace.h
> +++ b/arch/arm64/include/asm/ftrace.h
> @@ -16,6 +16,17 @@
>  #define MCOUNT_ADDR            ((unsigned long)_mcount)
>  #define MCOUNT_INSN_SIZE       AARCH64_INSN_SIZE
>
> +/* DYNAMIC_FTRACE_WITH_REGS is implemented by adding 2 NOPs at the beginning
> +   of each function, with the second NOP actually calling ftrace. In contrary
> +   to a classic _mcount call, the call instruction to be modified is thus
> +   the second one, and not the only one. */

OK, so the first slot will be patched unconditionally to do the 'mov x9, x30' ?

> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
> +#define ARCH_SUPPORTS_FTRACE_OPS 1
> +#define REC_IP_BRANCH_OFFSET AARCH64_INSN_SIZE
> +#else
> +#define REC_IP_BRANCH_OFFSET 0
> +#endif
> +
>  #ifndef __ASSEMBLY__
>  #include <linux/compat.h>
>
> --- a/arch/arm64/kernel/Makefile
> +++ b/arch/arm64/kernel/Makefile
> @@ -7,9 +7,9 @@ CPPFLAGS_vmlinux.lds    := -DTEXT_OFFSET=$(
>  AFLAGS_head.o          := -DTEXT_OFFSET=$(TEXT_OFFSET)
>  CFLAGS_armv8_deprecated.o := -I$(src)
>
> -CFLAGS_REMOVE_ftrace.o = -pg
> -CFLAGS_REMOVE_insn.o = -pg
> -CFLAGS_REMOVE_return_address.o = -pg
> +CFLAGS_REMOVE_ftrace.o = -pg $(CC_FLAGS_FTRACE)
> +CFLAGS_REMOVE_insn.o = -pg $(CC_FLAGS_FTRACE)
> +CFLAGS_REMOVE_return_address.o = -pg $(CC_FLAGS_FTRACE)
>
>  # Object file lists.
>  arm64-obj-y            := debug-monitors.o entry.o irq.o fpsimd.o              \
> --- a/drivers/firmware/efi/libstub/Makefile
> +++ b/drivers/firmware/efi/libstub/Makefile
> @@ -11,7 +11,8 @@ cflags-$(CONFIG_X86)          += -m$(BITS) -D__K
>                                    -fPIC -fno-strict-aliasing -mno-red-zone \
>                                    -mno-mmx -mno-sse -fshort-wchar
>
> -cflags-$(CONFIG_ARM64)         := $(subst -pg,,$(KBUILD_CFLAGS)) -fpie
> +cflags-$(CONFIG_ARM64)         := $(filter-out -pg $(CC_FLAGS_FTRACE)\
> +                                 ,$(KBUILD_CFLAGS)) -fpie
>  cflags-$(CONFIG_ARM)           := $(subst -pg,,$(KBUILD_CFLAGS)) \
>                                    -fno-builtin -fpic -mno-single-pic-base
>
> --- a/arch/arm64/kernel/entry-ftrace.S
> +++ b/arch/arm64/kernel/entry-ftrace.S
> @@ -13,6 +13,8 @@
>  #include <asm/assembler.h>
>  #include <asm/ftrace.h>
>  #include <asm/insn.h>
> +#include <asm/asm-offsets.h>
> +#include <asm/assembler.h>
>
>  /*
>   * Gcc with -pg will put the following code in the beginning of each function:
> @@ -123,6 +125,7 @@ skip_ftrace_call:                   // }
>  ENDPROC(_mcount)
>
>  #else /* CONFIG_DYNAMIC_FTRACE */
> +#ifndef CC_USING_PATCHABLE_FUNCTION_ENTRY
>  /*
>   * _mcount() is used to build the kernel with -pg option, but all the branch
>   * instructions to _mcount() are replaced to NOP initially at kernel start up,
> @@ -162,6 +165,92 @@ ftrace_graph_call:                 // ftrace_graph_cal
>
>         mcount_exit
>  ENDPROC(ftrace_caller)
> +#else /* CC_USING_PATCHABLE_FUNCTION_ENTRY */
> +
> +/* Since no -pg or similar compiler flag is used, there should really be
> +   no reference to _mcount; so do not define one. Only a function address
> +   is needed in order to refer to it. */
> +ENTRY(_mcount)
> +       ret     /* just in case, prevent any fall through. */
> +ENDPROC(_mcount)
> +
> +ENTRY(ftrace_regs_caller)
> +       sub     sp, sp, #S_FRAME_SIZE
> +       stp     x29, x9, [sp, #-16]     /* FP/LR link */
> +

You cannot write below the stack pointer. So you are missing a
trailing ! here. Note that you can fold the sub

stp x29, x9, [sp, #-(S_FRAME_SIZE+16)]!

> +       stp     x10, x11, [sp, #S_X10]
> +       stp     x12, x13, [sp, #S_X12]
> +       stp     x14, x15, [sp, #112]
> +       stp     x16, x17, [sp, #128]
> +       stp     x18, x19, [sp, #144]
> +       stp     x20, x21, [sp, #160]
> +       stp     x22, x23, [sp, #176]
> +       stp     x24, x25, [sp, #192]
> +       stp     x26, x27, [sp, #208]
> +

All these will shift by 16 bytes though

I am now wondering if it wouldn't be better to create 2 stack frames:
one for the interrupted function, and one for this function.

So something like

stp x29, x9, [sp, #-16]!
mov x29, sp

stp x29, x30, [sp, #-(S_FRAME_SIZE + 16]!

... store all registers including x29 ...

and do another mov x29, sp before calling into the handler. That way
everything should be visible on the call stack when we do a backtrace.


> +       b       ftrace_common
> +ENDPROC(ftrace_regs_caller)
> +
> +ENTRY(ftrace_caller)
> +       sub     sp, sp, #S_FRAME_SIZE
> +       stp     x29, x9, [sp, #-16]     /* FP/LR link */
> +

Same as above

> +ftrace_common:
> +       stp     x28, x29, [sp, #224]    /* FP in pt_regs + "our" x28 */
> +
> +       /* save function arguments */
> +       stp     x0, x1, [sp]
> +       stp     x2, x3, [sp, #16]
> +       stp     x4, x5, [sp, #32]
> +       stp     x6, x7, [sp, #48]
> +       stp     x8, x9, [sp, #64]
> +
> +       /* The link Register at callee entry */
> +       str     x9, [sp, #S_LR]         /* to pt_regs.r[30] */
> +       /* The program counter just after the ftrace call site */
> +       str     lr, [sp, #S_PC]
> +       /* The stack pointer as it was on ftrace_caller entry... */
> +       add     x29, sp, #S_FRAME_SIZE
> +       str     x29, [sp, #S_SP]
> +
> +       ldr_l   x2, function_trace_op, x0
> +       mov     x1, x9          /* saved LR == parent IP */
> +       sub     x0, lr, #8      /* function entry == IP */
> +       mov     x3, sp          /* pt_regs are @sp */
> +       sub     sp, sp, #16     /* skip over FP/LR link */
> +
> +       .global ftrace_call
> +ftrace_call:
> +
> +       bl      ftrace_stub
> +
> +#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> +       .global ftrace_graph_call
> +ftrace_graph_call:                     // ftrace_graph_caller();
> +       nop                             // If enabled, this will be replaced
> +                                       // "b ftrace_graph_caller"
> +#endif
> +
> +ftrace_common_return:
> +       add     sp, sp, #16     /* advance to pt_regs for restore */
> +
> +       ldp     x0, x1, [sp]
> +       ldp     x2, x3, [sp, #16]
> +       ldp     x4, x5, [sp, #32]
> +       ldp     x6, x7, [sp, #48]
> +       ldp     x8, x9, [sp, #64]
> +
> +       ldp     x28, x29, [sp, #224]
> +
> +       ldr     x9, [sp, #S_PC]
> +       ldr     lr, [sp, #S_LR]
> +       add     sp, sp, #S_FRAME_SIZE
> +
> +       ret     x9
> +
> +ENDPROC(ftrace_caller)
> +
> +#endif /* CC_USING_PATCHABLE_FUNCTION_ENTRY */
>  #endif /* CONFIG_DYNAMIC_FTRACE */
>
>  ENTRY(ftrace_stub)
> @@ -197,12 +286,20 @@ ENDPROC(ftrace_stub)
>   * and run return_to_handler() later on its exit.
>   */
>  ENTRY(ftrace_graph_caller)
> +#ifndef CC_USING_PATCHABLE_FUNCTION_ENTRY
>         mcount_get_lr_addr        x0    //     pointer to function's saved lr
>         mcount_get_pc             x1    //     function's pc
>         mcount_get_parent_fp      x2    //     parent's fp
>         bl      prepare_ftrace_return   // prepare_ftrace_return(&lr, pc, fp)
>
>         mcount_exit
> +#else
> +       add     x0, sp, #S_LR+16        /* address of (LR pointing into caller) */
> +       ldr     x1, [sp, #S_PC+16]
> +       ldr     x2, [sp, #S_X28 + 8+16] /* caller's frame pointer */
> +       bl      prepare_ftrace_return
> +       b       ftrace_common_return
> +#endif
>  ENDPROC(ftrace_graph_caller)
>
>  /*
> --- a/arch/arm64/kernel/ftrace.c
> +++ b/arch/arm64/kernel/ftrace.c
> @@ -65,18 +65,66 @@ int ftrace_update_ftrace_func(ftrace_fun
>         return ftrace_modify_code(pc, 0, new, false);
>  }
>
> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
> +/* Have the assembler generate a known "mov x9,x30" at compile time. */
> +static void notrace noinline __attribute__((used)) mov_x9_x30(void)
> +{
> +       asm(" .global insn_mov_x9_x30\n"
> +                    "insn_mov_x9_x30: mov x9,x30\n" : : : "x9");
> +}

You cannot rely on the compiler putting the mov at the beginning. I
think some well commented #define should do for the opcode (or did you
just remove that?)

> +#endif
> +
> +#ifdef CONFIG_ARM64_MODULE_PLTS
> +int use_ftrace_trampoline(struct module *mod, unsigned long *addr)
> +{
> +       struct plt_entry trampoline;
> +       trampoline = get_plt_entry(*addr);
> +       if (*addr == FTRACE_ADDR) {
> +               if (!plt_entries_equal(mod->arch.ftrace_trampoline,
> +                                      &trampoline)) {
> +
> +                       /* point the trampoline to our ftrace entry point */
> +                       module_disable_ro(mod);
> +                       *mod->arch.ftrace_trampoline = trampoline;
> +                       module_enable_ro(mod, true);
> +
> +                       /* update trampoline before patching in the branch */
> +                       smp_wmb();
> +               }
> +               *addr = (unsigned long)(void *)mod->arch.ftrace_trampoline;
> +       }
> +       else if (*addr == FTRACE_REGS_ADDR) {
> +               if (!plt_entries_equal(mod->arch.ftrace_regs_trampoline,
> +                                      &trampoline)) {
> +
> +                       /* point the trampoline to our ftrace entry point */
> +                       module_disable_ro(mod);
> +                       *mod->arch.ftrace_regs_trampoline = trampoline;
> +                       module_enable_ro(mod, true);
> +
> +                       /* update trampoline before patching in the branch */
> +                       smp_wmb();
> +               }
> +               *addr = (unsigned long)(void *)mod->arch.ftrace_regs_trampoline;
> +       }
> +       else
> +               return -EINVAL;
> +       return 0;
> +}
> +#endif
> +
>  /*
>   * Turn on the call to ftrace_caller() in instrumented function
>   */
>  int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
>  {
> -       unsigned long pc = rec->ip;
> +       unsigned long pc = rec->ip + REC_IP_BRANCH_OFFSET;
> +       int ret;
>         u32 old, new;
>         long offset = (long)pc - (long)addr;
>
>         if (offset < -SZ_128M || offset >= SZ_128M) {
>  #ifdef CONFIG_ARM64_MODULE_PLTS
> -               struct plt_entry trampoline;
>                 struct module *mod;
>
>                 /*
> @@ -96,54 +144,67 @@ int ftrace_make_call(struct dyn_ftrace *
>                 if (WARN_ON(!mod))
>                         return -EINVAL;
>
> -               /*
> -                * There is only one ftrace trampoline per module. For now,
> -                * this is not a problem since on arm64, all dynamic ftrace
> -                * invocations are routed via ftrace_caller(). This will need
> -                * to be revisited if support for multiple ftrace entry points
> -                * is added in the future, but for now, the pr_err() below
> -                * deals with a theoretical issue only.
> -                */
> -               trampoline = get_plt_entry(addr);
> -               if (!plt_entries_equal(mod->arch.ftrace_trampoline,
> -                                      &trampoline)) {
> -                       if (!plt_entries_equal(mod->arch.ftrace_trampoline,
> -                                              &(struct plt_entry){})) {
> -                               pr_err("ftrace: far branches to multiple entry points unsupported inside a single module\n");
> -                               return -EINVAL;
> -                       }
> -
> -                       /* point the trampoline to our ftrace entry point */
> -                       module_disable_ro(mod);
> -                       *mod->arch.ftrace_trampoline = trampoline;
> -                       module_enable_ro(mod, true);
> -
> -                       /* update trampoline before patching in the branch */
> -                       smp_wmb();
> +               /* Check against our well-known list of ftrace entry points */
> +               if (addr == FTRACE_ADDR || addr == FTRACE_REGS_ADDR) {
> +                       ret = use_ftrace_trampoline(mod, &addr);
> +                       if (ret < 0)
> +                               return ret;
>                 }
> -               addr = (unsigned long)(void *)mod->arch.ftrace_trampoline;
> +               else
> +                       return -EINVAL;
> +
>  #else /* CONFIG_ARM64_MODULE_PLTS */
>                 return -EINVAL;
>  #endif /* CONFIG_ARM64_MODULE_PLTS */
>         }
>
>         old = aarch64_insn_gen_nop();
> +       if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS)) {
> +               new = *(u32*)&mov_x9_x30;
> +               ret = ftrace_modify_code(pc - REC_IP_BRANCH_OFFSET,
> +                                        old, new, true);
> +               if (ret)
> +                       return ret;
> +               smp_wmb(); /* ensure LR saver is in place before ftrace call */
> +       }
>         new = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
>
>         return ftrace_modify_code(pc, old, new, true);
>  }
>
> +int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
> +                       unsigned long addr)
> +{
> +       unsigned long pc = rec->ip + REC_IP_BRANCH_OFFSET;
> +       u32 old, new;
> +
> +       old = aarch64_insn_gen_branch_imm(pc, old_addr, true);
> +       new = aarch64_insn_gen_branch_imm(pc, addr, true);
> +
> +       return ftrace_modify_code(pc, old, new, true);
> +}
> +
> +
>  /*
>   * Turn off the call to ftrace_caller() in instrumented function
>   */
>  int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
>                     unsigned long addr)
>  {
> -       unsigned long pc = rec->ip;
> +       unsigned long pc = rec->ip + REC_IP_BRANCH_OFFSET;
>         bool validate = true;
> +       int ret;
>         u32 old = 0, new;
>         long offset = (long)pc - (long)addr;
>
> +#ifdef CC_USING_PATCHABLE_FUNCTION_ENTRY
> +       /* -fpatchable-function-entry= does not generate a profiling call
> +        *  initially; the NOPs are already there.
> +        */
> +       if (addr == MCOUNT_ADDR)
> +               return 0;
> +#endif
> +
>         if (offset < -SZ_128M || offset >= SZ_128M) {
>  #ifdef CONFIG_ARM64_MODULE_PLTS
>                 u32 replaced;
> @@ -188,7 +249,16 @@ int ftrace_make_nop(struct module *mod,
>
>         new = aarch64_insn_gen_nop();
>
> -       return ftrace_modify_code(pc, old, new, validate);
> +       ret = ftrace_modify_code(pc, old, new, validate);
> +       if (ret)
> +               return ret;
> +       if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS)) {
> +               smp_wmb(); /* ftrace call must not remain without LR saver. */
> +               old = *(u32*)&mov_x9_x30;
> +               ret = ftrace_modify_code(pc - REC_IP_BRANCH_OFFSET,
> +                                        old, new, true);
> +       }
> +       return ret;
>  }
>
>  void arch_ftrace_update_code(int command)
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -115,6 +115,7 @@
>  #define MCOUNT_REC()   . = ALIGN(8);                           \
>                         __start_mcount_loc = .;                 \
>                         KEEP(*(__mcount_loc))                   \
> +                       KEEP(*(__patchable_function_entries))   \
>                         __stop_mcount_loc = .;
>  #else
>  #define MCOUNT_REC()
> --- a/include/linux/compiler_types.h
> +++ b/include/linux/compiler_types.h
> @@ -61,8 +61,12 @@ extern void __chk_io_ptr(const volatile
>  #if defined(CC_USING_HOTPATCH) && !defined(__CHECKER__)
>  #define notrace __attribute__((hotpatch(0,0)))
>  #else
> +#ifdef CC_USING_PATCHABLE_FUNCTION_ENTRY
> +#define notrace __attribute__((patchable_function_entry(0)))
> +#else
>  #define notrace __attribute__((no_instrument_function))
>  #endif
> +#endif
>
>  /* Intel compiler defines __GNUC__. So we will overwrite implementations
>   * coming from above header files here
> --- a/arch/arm64/include/asm/module.h
> +++ b/arch/arm64/include/asm/module.h
> @@ -33,6 +33,7 @@ struct mod_arch_specific {
>
>         /* for CONFIG_DYNAMIC_FTRACE */
>         struct plt_entry        *ftrace_trampoline;
> +       struct plt_entry        *ftrace_regs_trampoline;
>  };
>  #endif
>
> --- a/arch/arm64/kernel/module.c
> +++ b/arch/arm64/kernel/module.c
> @@ -454,6 +454,10 @@ int module_finalize(const Elf_Ehdr *hdr,
>                 if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
>                     !strcmp(".text.ftrace_trampoline", secstrs + s->sh_name))
>                         me->arch.ftrace_trampoline = (void *)s->sh_addr;
> +               if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS) &&
> +                   !strcmp(".text.ftrace_regs_trampoline",
> +                               secstrs + s->sh_name))
> +                       me->arch.ftrace_regs_trampoline = (void *)s->sh_addr;
>  #endif
>         }
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 2/4] arm64: implement ftrace with regs
  2018-10-01 15:57   ` Ard Biesheuvel
@ 2018-10-02 10:02     ` Torsten Duwe
  2018-10-02 10:39       ` Ard Biesheuvel
  0 siblings, 1 reply; 48+ messages in thread
From: Torsten Duwe @ 2018-10-02 10:02 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Will Deacon, Catalin Marinas, Julien Thierry, Steven Rostedt,
	Josh Poimboeuf, Ingo Molnar, Arnd Bergmann, AKASHI Takahiro,
	linux-arm-kernel, Linux Kernel Mailing List, live-patching

On Mon, Oct 01, 2018 at 05:57:52PM +0200, Ard Biesheuvel wrote:
> > --- a/arch/arm64/include/asm/ftrace.h
> > +++ b/arch/arm64/include/asm/ftrace.h
> > @@ -16,6 +16,17 @@
> >  #define MCOUNT_ADDR            ((unsigned long)_mcount)
> >  #define MCOUNT_INSN_SIZE       AARCH64_INSN_SIZE
> >
> > +/* DYNAMIC_FTRACE_WITH_REGS is implemented by adding 2 NOPs at the beginning
> > +   of each function, with the second NOP actually calling ftrace. In contrary
> > +   to a classic _mcount call, the call instruction to be modified is thus
> > +   the second one, and not the only one. */
> 
> OK, so the first slot will be patched unconditionally to do the 'mov x9, x30' ?

Right.

> > +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
> > +#define ARCH_SUPPORTS_FTRACE_OPS 1
> > +#define REC_IP_BRANCH_OFFSET AARCH64_INSN_SIZE
> > +#else
> > +#define REC_IP_BRANCH_OFFSET 0
> > +#endif

The main reason for above comment was that a previous reviewer wondered
about a magic value of "4" for the REC_IP_BRANCH_OFFSET, which is actually
an insn size. The comment should leave no doubt. I'd leave the LR save
explanation elsewhere.

> >         mcount_exit
> >  ENDPROC(ftrace_caller)
> > +#else /* CC_USING_PATCHABLE_FUNCTION_ENTRY */
> > +
> > +/* Since no -pg or similar compiler flag is used, there should really be
> > +   no reference to _mcount; so do not define one. Only a function address
> > +   is needed in order to refer to it. */
> > +ENTRY(_mcount)
> > +       ret     /* just in case, prevent any fall through. */
> > +ENDPROC(_mcount)
> > +
> > +ENTRY(ftrace_regs_caller)
> > +       sub     sp, sp, #S_FRAME_SIZE
> > +       stp     x29, x9, [sp, #-16]     /* FP/LR link */
> > +
> 
> You cannot write below the stack pointer. So you are missing a
> trailing ! here. Note that you can fold the sub
> 
> stp x29, x9, [sp, #-(S_FRAME_SIZE+16)]!

Very well, but...

> > +       stp     x10, x11, [sp, #S_X10]
> > +       stp     x12, x13, [sp, #S_X12]
> > +       stp     x14, x15, [sp, #112]
> > +       stp     x16, x17, [sp, #128]
> > +       stp     x18, x19, [sp, #144]
> > +       stp     x20, x21, [sp, #160]
> > +       stp     x22, x23, [sp, #176]
> > +       stp     x24, x25, [sp, #192]
> > +       stp     x26, x27, [sp, #208]
> > +
> 
> All these will shift by 16 bytes though
> 
> I am now wondering if it wouldn't be better to create 2 stack frames:
> one for the interrupted function, and one for this function.
> 
> So something like
> 
> stp x29, x9, [sp, #-16]!
> mov x29, sp

That's about the way it was before, when you criticised it was
the wrong way ;-)

> stp x29, x30, [sp, #-(S_FRAME_SIZE + 16]!
> 
> ... store all registers including x29 ...
> 
> and do another mov x29, sp before calling into the handler. That way
> everything should be visible on the call stack when we do a backtrace.

I'm not 100% sure, but I think it already is visible correctly. Note
that the callee has in no way been called yet; control flow is
immediately diverted to the ftrace_caller.

About using SP as a pt_regs pointer: maybe I can free another register
for that purpose and thus achieve conformance *and* pretty code.

> 
> > +       b       ftrace_common
> > +ENDPROC(ftrace_regs_caller)
> > +
> > +ENTRY(ftrace_caller)
> > +       sub     sp, sp, #S_FRAME_SIZE
> > +       stp     x29, x9, [sp, #-16]     /* FP/LR link */
> > +
> 
> Same as above

Yes, Steven demanded 2 entry points :)

> >  /*
> > --- a/arch/arm64/kernel/ftrace.c
> > +++ b/arch/arm64/kernel/ftrace.c
> > @@ -65,18 +65,66 @@ int ftrace_update_ftrace_func(ftrace_fun
> >         return ftrace_modify_code(pc, 0, new, false);
> >  }
> >
> > +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
> > +/* Have the assembler generate a known "mov x9,x30" at compile time. */
> > +static void notrace noinline __attribute__((used)) mov_x9_x30(void)
> > +{
> > +       asm(" .global insn_mov_x9_x30\n"
> > +                    "insn_mov_x9_x30: mov x9,x30\n" : : : "x9");
> > +}
> 
> You cannot rely on the compiler putting the mov at the beginning. I

As you can see from the asm inline, I tried the more precise assembler
label, but it didn't work out. With enough optimisation, the mov _is_
first; but you're right, it's not a good idea to rely on that.

> think some well commented #define should do for the opcode (or did you
> just remove that?)

Alas, yes I did. I had a define, then run-time generation, and now this
assembler hack. Looking at the 3, the define would be best, I'd say.

	Torsten


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 2/4] arm64: implement ftrace with regs
  2018-10-02 10:02     ` Torsten Duwe
@ 2018-10-02 10:39       ` Ard Biesheuvel
  0 siblings, 0 replies; 48+ messages in thread
From: Ard Biesheuvel @ 2018-10-02 10:39 UTC (permalink / raw)
  To: Torsten Duwe
  Cc: Will Deacon, Catalin Marinas, Julien Thierry, Steven Rostedt,
	Josh Poimboeuf, Ingo Molnar, Arnd Bergmann, AKASHI Takahiro,
	linux-arm-kernel, Linux Kernel Mailing List, live-patching

On 2 October 2018 at 12:02, Torsten Duwe <duwe@lst.de> wrote:
> On Mon, Oct 01, 2018 at 05:57:52PM +0200, Ard Biesheuvel wrote:
>> > --- a/arch/arm64/include/asm/ftrace.h
>> > +++ b/arch/arm64/include/asm/ftrace.h
>> > @@ -16,6 +16,17 @@
>> >  #define MCOUNT_ADDR            ((unsigned long)_mcount)
>> >  #define MCOUNT_INSN_SIZE       AARCH64_INSN_SIZE
>> >
>> > +/* DYNAMIC_FTRACE_WITH_REGS is implemented by adding 2 NOPs at the beginning
>> > +   of each function, with the second NOP actually calling ftrace. In contrary
>> > +   to a classic _mcount call, the call instruction to be modified is thus
>> > +   the second one, and not the only one. */
>>
>> OK, so the first slot will be patched unconditionally to do the 'mov x9, x30' ?
>
> Right.
>
>> > +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
>> > +#define ARCH_SUPPORTS_FTRACE_OPS 1
>> > +#define REC_IP_BRANCH_OFFSET AARCH64_INSN_SIZE
>> > +#else
>> > +#define REC_IP_BRANCH_OFFSET 0
>> > +#endif
>
> The main reason for above comment was that a previous reviewer wondered
> about a magic value of "4" for the REC_IP_BRANCH_OFFSET, which is actually
> an insn size. The comment should leave no doubt. I'd leave the LR save
> explanation elsewhere.
>
>> >         mcount_exit
>> >  ENDPROC(ftrace_caller)
>> > +#else /* CC_USING_PATCHABLE_FUNCTION_ENTRY */
>> > +
>> > +/* Since no -pg or similar compiler flag is used, there should really be
>> > +   no reference to _mcount; so do not define one. Only a function address
>> > +   is needed in order to refer to it. */
>> > +ENTRY(_mcount)
>> > +       ret     /* just in case, prevent any fall through. */
>> > +ENDPROC(_mcount)
>> > +
>> > +ENTRY(ftrace_regs_caller)
>> > +       sub     sp, sp, #S_FRAME_SIZE
>> > +       stp     x29, x9, [sp, #-16]     /* FP/LR link */
>> > +
>>
>> You cannot write below the stack pointer. So you are missing a
>> trailing ! here. Note that you can fold the sub
>>
>> stp x29, x9, [sp, #-(S_FRAME_SIZE+16)]!
>
> Very well, but...
>
>> > +       stp     x10, x11, [sp, #S_X10]
>> > +       stp     x12, x13, [sp, #S_X12]
>> > +       stp     x14, x15, [sp, #112]
>> > +       stp     x16, x17, [sp, #128]
>> > +       stp     x18, x19, [sp, #144]
>> > +       stp     x20, x21, [sp, #160]
>> > +       stp     x22, x23, [sp, #176]
>> > +       stp     x24, x25, [sp, #192]
>> > +       stp     x26, x27, [sp, #208]
>> > +
>>
>> All these will shift by 16 bytes though
>>
>> I am now wondering if it wouldn't be better to create 2 stack frames:
>> one for the interrupted function, and one for this function.
>>
>> So something like
>>
>> stp x29, x9, [sp, #-16]!
>> mov x29, sp
>
> That's about the way it was before, when you criticised it was
> the wrong way ;-)
>

Really? With two stack frames?

In any case, the important thing is that you call into the next
function with fp/lr on the top of the stack, and fp pointing to the
next fp/lr pair.

I think it would be an improvement to add the second fp/lr for the
interrupted function first, or the *caller* of that function will not
be visible from the state of the stack (since x9 is the only register
that points into that function)

So in summary:

ftraced_function():
  mov x9, x30
  bl  ftrace_regs_caller

ftrace_regs_caller():
  stp  x29, x9, [sp, #-16]!
  mov x29, sp

* At this point, we have a fp/lr pair on the top of the stack that
links the call to ftraced_function() into its caller.

stp x29, x30, [sp, #-(S_FRAME_SIZE+16)]!

(note the x30 instead of x9 - my mistake)

* At this point we have a fp/lr pair on the top of the stack that
links the call to ftrace_regs_caller() into ftraced_function()

You can now populate the pt_regs structure with the various register value.

mov x29, sp

* Now fp points to the top of the stack, where a fp/lr pair lives, so
you can proceed to call other functions.

I hope this helps.





>> stp x29, x30, [sp, #-(S_FRAME_SIZE + 16]!
>>
>> ... store all registers including x29 ...
>>
>> and do another mov x29, sp before calling into the handler. That way
>> everything should be visible on the call stack when we do a backtrace.
>
> I'm not 100% sure, but I think it already is visible correctly. Note
> that the callee has in no way been called yet; control flow is
> immediately diverted to the ftrace_caller.
>

Yes but the link register lives in x9 so there is no way the normal
backtrace logic can see where the ftraced_function() has been called
from.

> About using SP as a pt_regs pointer: maybe I can free another register
> for that purpose and thus achieve conformance *and* pretty code.
>

Sure.

>>
>> > +       b       ftrace_common
>> > +ENDPROC(ftrace_regs_caller)
>> > +
>> > +ENTRY(ftrace_caller)
>> > +       sub     sp, sp, #S_FRAME_SIZE
>> > +       stp     x29, x9, [sp, #-16]     /* FP/LR link */
>> > +
>>
>> Same as above
>
> Yes, Steven demanded 2 entry points :)
>
>> >  /*
>> > --- a/arch/arm64/kernel/ftrace.c
>> > +++ b/arch/arm64/kernel/ftrace.c
>> > @@ -65,18 +65,66 @@ int ftrace_update_ftrace_func(ftrace_fun
>> >         return ftrace_modify_code(pc, 0, new, false);
>> >  }
>> >
>> > +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
>> > +/* Have the assembler generate a known "mov x9,x30" at compile time. */
>> > +static void notrace noinline __attribute__((used)) mov_x9_x30(void)
>> > +{
>> > +       asm(" .global insn_mov_x9_x30\n"
>> > +                    "insn_mov_x9_x30: mov x9,x30\n" : : : "x9");
>> > +}
>>
>> You cannot rely on the compiler putting the mov at the beginning. I
>
> As you can see from the asm inline, I tried the more precise assembler
> label, but it didn't work out. With enough optimisation, the mov _is_
> first; but you're right, it's not a good idea to rely on that.
>

Ah right, I missed that. Still pretty nasty though :-)

>> think some well commented #define should do for the opcode (or did you
>> just remove that?)
>
> Alas, yes I did. I had a define, then run-time generation, and now this
> assembler hack. Looking at the 3, the define would be best, I'd say.
>

I tend to agree with that.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 2/4] arm64: implement ftrace with regs
  2018-10-01 14:16 ` [PATCH v3 2/4] arm64: implement ftrace with regs Torsten Duwe
  2018-10-01 15:57   ` Ard Biesheuvel
@ 2018-10-02 11:27   ` Mark Rutland
  2018-10-02 12:18     ` Torsten Duwe
  1 sibling, 1 reply; 48+ messages in thread
From: Mark Rutland @ 2018-10-02 11:27 UTC (permalink / raw)
  To: Torsten Duwe
  Cc: Will Deacon, Catalin Marinas, Julien Thierry, Steven Rostedt,
	Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel, Arnd Bergmann,
	AKASHI Takahiro, linux-arm-kernel, linux-kernel, live-patching

On Mon, Oct 01, 2018 at 04:16:48PM +0200, Torsten Duwe wrote:
> Check for compiler support of -fpatchable-function-entry and use it
> to intercept functions immediately on entry, saving the LR in x9.
> patchable-function-entry in GCC disables IPA-RA, which means ABI
> register calling conventions are obeyed *and* scratch registers are
> available.
> Disable ftracing in efi/libstub, because this triggers cross-section
> linker errors now (-pg is disabled already for those files).
> Add an ftrace_caller which can handle LR in x9, as well as an
> ftrace_regs_caller that additionally writes out a set of pt_regs
> for inspection.
> Introduce and handle an ftrace_regs_trampoline for module PLTs.

This reads like a changelog rather than a commit message, which makes
review far more painful than necessary.

It would be very helpful if you could rewrite this commit message to
explain *what* you are trying to achieve, the caveats, then if
necessary, any details worth noting.

For example, could you please add an explanation of why the magic with
x9 is necessary? I understand that you need to preserve the original
x30, and that rationale should be in the commit message.

> Signed-off-by: Torsten Duwe <duwe@suse.de>
> 
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -110,6 +110,7 @@ config ARM64
>  	select HAVE_DEBUG_KMEMLEAK
>  	select HAVE_DMA_CONTIGUOUS
>  	select HAVE_DYNAMIC_FTRACE
> +	select HAVE_DYNAMIC_FTRACE_WITH_REGS
>  	select HAVE_EFFICIENT_UNALIGNED_ACCESS
>  	select HAVE_FTRACE_MCOUNT_RECORD
>  	select HAVE_FUNCTION_TRACER
> --- a/arch/arm64/Makefile
> +++ b/arch/arm64/Makefile
> @@ -78,6 +78,15 @@ ifeq ($(CONFIG_ARM64_MODULE_PLTS),y)
>  KBUILD_LDFLAGS_MODULE	+= -T $(srctree)/arch/arm64/kernel/module.lds
>  endif
>  
> +ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_REGS),y)
> +  CC_FLAGS_FTRACE := -fpatchable-function-entry=2
> +  KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
> +  ifeq ($(call cc-option,-fpatchable-function-entry=2),)
> +    $(error Cannot use CONFIG_DYNAMIC_FTRACE_WITH_REGS: \
> +             -fpatchable-function-entry not supported by compiler)
> +  endif
> +endif
> +
>  # Default value
>  head-y		:= arch/arm64/kernel/head.o
>  
> --- a/arch/arm64/include/asm/ftrace.h
> +++ b/arch/arm64/include/asm/ftrace.h
> @@ -16,6 +16,17 @@
>  #define MCOUNT_ADDR		((unsigned long)_mcount)
>  #define MCOUNT_INSN_SIZE	AARCH64_INSN_SIZE
>  
> +/* DYNAMIC_FTRACE_WITH_REGS is implemented by adding 2 NOPs at the beginning
> +   of each function, with the second NOP actually calling ftrace. In contrary
> +   to a classic _mcount call, the call instruction to be modified is thus
> +   the second one, and not the only one. */

Bogus comment style. Please follow the usual style:

/*
 * Like this.
 */

> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
> +#define ARCH_SUPPORTS_FTRACE_OPS 1
> +#define REC_IP_BRANCH_OFFSET AARCH64_INSN_SIZE
> +#else
> +#define REC_IP_BRANCH_OFFSET 0
> +#endif
> +
>  #ifndef __ASSEMBLY__
>  #include <linux/compat.h>
>  
> --- a/arch/arm64/kernel/Makefile
> +++ b/arch/arm64/kernel/Makefile
> @@ -7,9 +7,9 @@ CPPFLAGS_vmlinux.lds	:= -DTEXT_OFFSET=$(
>  AFLAGS_head.o		:= -DTEXT_OFFSET=$(TEXT_OFFSET)
>  CFLAGS_armv8_deprecated.o := -I$(src)
>  
> -CFLAGS_REMOVE_ftrace.o = -pg
> -CFLAGS_REMOVE_insn.o = -pg
> -CFLAGS_REMOVE_return_address.o = -pg
> +CFLAGS_REMOVE_ftrace.o = -pg $(CC_FLAGS_FTRACE)
> +CFLAGS_REMOVE_insn.o = -pg $(CC_FLAGS_FTRACE)
> +CFLAGS_REMOVE_return_address.o = -pg $(CC_FLAGS_FTRACE)
>  
>  # Object file lists.
>  arm64-obj-y		:= debug-monitors.o entry.o irq.o fpsimd.o		\
> --- a/drivers/firmware/efi/libstub/Makefile
> +++ b/drivers/firmware/efi/libstub/Makefile
> @@ -11,7 +11,8 @@ cflags-$(CONFIG_X86)		+= -m$(BITS) -D__K
>  				   -fPIC -fno-strict-aliasing -mno-red-zone \
>  				   -mno-mmx -mno-sse -fshort-wchar
>  
> -cflags-$(CONFIG_ARM64)		:= $(subst -pg,,$(KBUILD_CFLAGS)) -fpie
> +cflags-$(CONFIG_ARM64)		:= $(filter-out -pg $(CC_FLAGS_FTRACE)\
> +				  ,$(KBUILD_CFLAGS)) -fpie
>  cflags-$(CONFIG_ARM)		:= $(subst -pg,,$(KBUILD_CFLAGS)) \
>  				   -fno-builtin -fpic -mno-single-pic-base
>  
> --- a/arch/arm64/kernel/entry-ftrace.S
> +++ b/arch/arm64/kernel/entry-ftrace.S
> @@ -13,6 +13,8 @@
>  #include <asm/assembler.h>
>  #include <asm/ftrace.h>
>  #include <asm/insn.h>
> +#include <asm/asm-offsets.h>
> +#include <asm/assembler.h>
>  
>  /*
>   * Gcc with -pg will put the following code in the beginning of each function:
> @@ -123,6 +125,7 @@ skip_ftrace_call:			// }
>  ENDPROC(_mcount)
>  
>  #else /* CONFIG_DYNAMIC_FTRACE */
> +#ifndef CC_USING_PATCHABLE_FUNCTION_ENTRY
>  /*
>   * _mcount() is used to build the kernel with -pg option, but all the branch
>   * instructions to _mcount() are replaced to NOP initially at kernel start up,
> @@ -162,6 +165,92 @@ ftrace_graph_call:			// ftrace_graph_cal
>  
>  	mcount_exit
>  ENDPROC(ftrace_caller)
> +#else /* CC_USING_PATCHABLE_FUNCTION_ENTRY */
> +
> +/* Since no -pg or similar compiler flag is used, there should really be
> +   no reference to _mcount; so do not define one. Only a function address
> +   is needed in order to refer to it. */

Bogus comment style again.

The comment says that there shouldn't be any reference to _mcount, then
that we should define it in order to refer to it, which doesn't make any
sense to me.

If there shouldn't be any references to it, there's no reason for it to
exist at all. Where is it being referred to, and why?

> +ENTRY(_mcount)
> +	ret	/* just in case, prevent any fall through. */
> +ENDPROC(_mcount)
> +
> +ENTRY(ftrace_regs_caller)
> +	sub	sp, sp, #S_FRAME_SIZE
> +	stp	x29, x9, [sp, #-16]	/* FP/LR link */

It does not make sense to me to store this *below* the SP. We already
have pt_regs::stackframe, so please use that.

> +	stp	x10, x11, [sp, #S_X10]
> +	stp	x12, x13, [sp, #S_X12]
> +	stp	x14, x15, [sp, #112]
> +	stp	x16, x17, [sp, #128]
> +	stp	x18, x19, [sp, #144]
> +	stp	x20, x21, [sp, #160]
> +	stp	x22, x23, [sp, #176]
> +	stp	x24, x25, [sp, #192]
> +	stp	x26, x27, [sp, #208]

Please use the S_X14...S_X26 definitions, following on from S_X10 and
S_X12.

> +
> +	b	ftrace_common
> +ENDPROC(ftrace_regs_caller)
> +
> +ENTRY(ftrace_caller)
> +	sub	sp, sp, #S_FRAME_SIZE
> +	stp	x29, x9, [sp, #-16]	/* FP/LR link */

As with ftrace_regs_caller, storing below the SP is not sound. Please
use pt_regs::stackframe.

> +
> +ftrace_common:
> +	stp	x28, x29, [sp, #224]	/* FP in pt_regs + "our" x28 */
> +
> +	/* save function arguments */
> +	stp	x0, x1, [sp]
> +	stp	x2, x3, [sp, #16]
> +	stp	x4, x5, [sp, #32]
> +	stp	x6, x7, [sp, #48]
> +	stp	x8, x9, [sp, #64]

Please use S_X0...S_X8 here.

> +	/* The link Register at callee entry */
> +	str	x9, [sp, #S_LR]		/* to pt_regs.r[30] */
> +	/* The program counter just after the ftrace call site */
> +	str	lr, [sp, #S_PC]
> +	/* The stack pointer as it was on ftrace_caller entry... */
> +	add	x29, sp, #S_FRAME_SIZE
> +	str	x29, [sp, #S_SP]

Corrupting the link register in this way does not seem safe to me.

> +
> +	ldr_l	x2, function_trace_op, x0
> +	mov	x1, x9		/* saved LR == parent IP */
> +	sub	x0, lr, #8	/* function entry == IP */
> +	mov	x3, sp		/* pt_regs are @sp */
> +	sub	sp, sp, #16	/* skip over FP/LR link */

Please use the embedded pt_regs::stackframe; that way you can avoid
having to play with the SP so much.

> +
> +	.global ftrace_call
> +ftrace_call:
> +
> +	bl	ftrace_stub
> +
> +#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> +	.global ftrace_graph_call
> +ftrace_graph_call:			// ftrace_graph_caller();
> +	nop				// If enabled, this will be replaced
> +					// "b ftrace_graph_caller"
> +#endif
> +
> +ftrace_common_return:
> +	add	sp, sp, #16	/* advance to pt_regs for restore */

With the embedded pt_regs::stackframe, this isn't necessary.

> +
> +	ldp	x0, x1, [sp]
> +	ldp	x2, x3, [sp, #16]
> +	ldp	x4, x5, [sp, #32]
> +	ldp	x6, x7, [sp, #48]
> +	ldp	x8, x9, [sp, #64]
> +
> +	ldp	x28, x29, [sp, #224]

Please use the S_X<n> definitions for these offsets.

> +
> +	ldr	x9, [sp, #S_PC]
> +	ldr	lr, [sp, #S_LR]
> +	add	sp, sp, #S_FRAME_SIZE
> +
> +	ret	x9
> +
> +ENDPROC(ftrace_caller)
> +
> +#endif /* CC_USING_PATCHABLE_FUNCTION_ENTRY */
>  #endif /* CONFIG_DYNAMIC_FTRACE */
>  
>  ENTRY(ftrace_stub)
> @@ -197,12 +286,20 @@ ENDPROC(ftrace_stub)
>   * and run return_to_handler() later on its exit.
>   */
>  ENTRY(ftrace_graph_caller)
> +#ifndef CC_USING_PATCHABLE_FUNCTION_ENTRY
>  	mcount_get_lr_addr	  x0	//     pointer to function's saved lr
>  	mcount_get_pc		  x1	//     function's pc
>  	mcount_get_parent_fp	  x2	//     parent's fp
>  	bl	prepare_ftrace_return	// prepare_ftrace_return(&lr, pc, fp)
>  
>  	mcount_exit
> +#else
> +	add	x0, sp, #S_LR+16	/* address of (LR pointing into caller) */

This comment didn't help. What exactly are you trying to access here?

Please put a space around binary operators, i.e. this should be:
	
	add	x0, sp, #S_LR + 16

... or with brackets to avoid any ambiguous readings:

	add	x0, sp, #(S_LR + 16)

> +	ldr	x1, [sp, #S_PC+16]
> +	ldr	x2, [sp, #S_X28 + 8+16]	/* caller's frame pointer */

What exactly are you trying to access here?

Please use an S_<reg> definition. If one doesn't exist, add one to
asm-offsets.

> +	bl	prepare_ftrace_return
> +	b	ftrace_common_return
> +#endif
>  ENDPROC(ftrace_graph_caller)
>  
>  /*
> --- a/arch/arm64/kernel/ftrace.c
> +++ b/arch/arm64/kernel/ftrace.c
> @@ -65,18 +65,66 @@ int ftrace_update_ftrace_func(ftrace_fun
>  	return ftrace_modify_code(pc, 0, new, false);
>  }
>  
> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
> +/* Have the assembler generate a known "mov x9,x30" at compile time. */
> +static void notrace noinline __attribute__((used)) mov_x9_x30(void)
> +{
> +	asm(" .global insn_mov_x9_x30\n"
> +		     "insn_mov_x9_x30: mov x9,x30\n" : : : "x9");
> +}
> +#endif

This is fragile and more complicated than necessary.

Please use the insn framework, as we do to generate all the other
instruction sequences in ftrace.

MOV (register) is an alias of ORR (shifted register), i.e.

	mov	<xd>, <xm>

... is:

	orr	<xd>, xzr, <xm>

... and we have code to generate ORR, so we can add a trivial wrapper to
generate MOV.

> +
> +#ifdef CONFIG_ARM64_MODULE_PLTS
> +int use_ftrace_trampoline(struct module *mod, unsigned long *addr)
> +{
> +	struct plt_entry trampoline;
> +	trampoline = get_plt_entry(*addr);
> +	if (*addr == FTRACE_ADDR) {
> +		if (!plt_entries_equal(mod->arch.ftrace_trampoline,
> +				       &trampoline)) {
> +
> +			/* point the trampoline to our ftrace entry point */
> +			module_disable_ro(mod);
> +			*mod->arch.ftrace_trampoline = trampoline;
> +			module_enable_ro(mod, true);
> +
> +			/* update trampoline before patching in the branch */
> +			smp_wmb();
> +		}
> +		*addr = (unsigned long)(void *)mod->arch.ftrace_trampoline;
> +	}
> +	else if (*addr == FTRACE_REGS_ADDR) {
> +		if (!plt_entries_equal(mod->arch.ftrace_regs_trampoline,
> +				       &trampoline)) {
> +
> +			/* point the trampoline to our ftrace entry point */
> +			module_disable_ro(mod);
> +			*mod->arch.ftrace_regs_trampoline = trampoline;
> +			module_enable_ro(mod, true);
> +
> +			/* update trampoline before patching in the branch */
> +			smp_wmb();
> +		}
> +		*addr = (unsigned long)(void *)mod->arch.ftrace_regs_trampoline;
> +	}
> +	else
> +		return -EINVAL;
> +	return 0;
> +}
> +#endif

I think you can simplify this significantly by making the module
trampoline address a variable:

int use_ftrace_trampoline(struct module *mod, unsigned long *addr)
{
	struct plt_entry trampoline = get_plt_entry(*addr);
	struct plt_entry *mod_trampoline;

	if (*addr == FTRACE_ADDR)
		mod_trampoline = mod->arch.ftrace_trampoline;
	else if (*addr == FTRACE_REGS_ADDR)
		mod_trampoline = mod->arch.ftrace_regs_trampoline;
	else
		return -EINVAL;

	if (!plt_entries_equal(mode_trampoline, &trampoline)) {

		/* point the trampoline to our ftrace entry point */
		module_disable_ro(mod);
		*mod_trampoline = trampoline;
		module_enable_ro(mod, true);

		/* update trampoline before patching in the branch */
		smp_wmb();
	}

	*addr = (unsigned long)mod_trampoline;

	return 0;
}

... also, this would be better names as something like
install_ftrace_trampoline.

> +
>  /*
>   * Turn on the call to ftrace_caller() in instrumented function
>   */
>  int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
>  {
> -	unsigned long pc = rec->ip;
> +	unsigned long pc = rec->ip + REC_IP_BRANCH_OFFSET;
> +	int ret;
>  	u32 old, new;
>  	long offset = (long)pc - (long)addr;
>  
>  	if (offset < -SZ_128M || offset >= SZ_128M) {
>  #ifdef CONFIG_ARM64_MODULE_PLTS
> -		struct plt_entry trampoline;
>  		struct module *mod;
>  
>  		/*
> @@ -96,54 +144,67 @@ int ftrace_make_call(struct dyn_ftrace *
>  		if (WARN_ON(!mod))
>  			return -EINVAL;
>  
> -		/*
> -		 * There is only one ftrace trampoline per module. For now,
> -		 * this is not a problem since on arm64, all dynamic ftrace
> -		 * invocations are routed via ftrace_caller(). This will need
> -		 * to be revisited if support for multiple ftrace entry points
> -		 * is added in the future, but for now, the pr_err() below
> -		 * deals with a theoretical issue only.
> -		 */
> -		trampoline = get_plt_entry(addr);
> -		if (!plt_entries_equal(mod->arch.ftrace_trampoline,
> -				       &trampoline)) {
> -			if (!plt_entries_equal(mod->arch.ftrace_trampoline,
> -					       &(struct plt_entry){})) {
> -				pr_err("ftrace: far branches to multiple entry points unsupported inside a single module\n");
> -				return -EINVAL;
> -			}
> -
> -			/* point the trampoline to our ftrace entry point */
> -			module_disable_ro(mod);
> -			*mod->arch.ftrace_trampoline = trampoline;
> -			module_enable_ro(mod, true);
> -
> -			/* update trampoline before patching in the branch */
> -			smp_wmb();
> +		/* Check against our well-known list of ftrace entry points */
> +		if (addr == FTRACE_ADDR || addr == FTRACE_REGS_ADDR) {
> +			ret = use_ftrace_trampoline(mod, &addr);
> +			if (ret < 0)
> +				return ret;
>  		}
> -		addr = (unsigned long)(void *)mod->arch.ftrace_trampoline;
> +		else
> +			return -EINVAL;
> +
>  #else /* CONFIG_ARM64_MODULE_PLTS */
>  		return -EINVAL;
>  #endif /* CONFIG_ARM64_MODULE_PLTS */
>  	}
>  
>  	old = aarch64_insn_gen_nop();
> +	if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS)) {
> +		new = *(u32*)&mov_x9_x30;

As before, please use the insn framework for this.

> +		ret = ftrace_modify_code(pc - REC_IP_BRANCH_OFFSET,
> +					 old, new, true);
> +		if (ret)
> +			return ret;
> +		smp_wmb(); /* ensure LR saver is in place before ftrace call */

Since ftrace_modify_code() does cache maintenance, completed with a DSB
ISH, the smp_wmb() isn't necessary -- the order of updates is already
ensured.

> +	}
>  	new = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
>  
>  	return ftrace_modify_code(pc, old, new, true);
>  }
>  
> +int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
> +			unsigned long addr)
> +{
> +	unsigned long pc = rec->ip + REC_IP_BRANCH_OFFSET;
> +	u32 old, new;
> +
> +	old = aarch64_insn_gen_branch_imm(pc, old_addr, true);
> +	new = aarch64_insn_gen_branch_imm(pc, addr, true);
> +
> +	return ftrace_modify_code(pc, old, new, true);
> +}
> +
> +
>  /*
>   * Turn off the call to ftrace_caller() in instrumented function
>   */
>  int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
>  		    unsigned long addr)
>  {
> -	unsigned long pc = rec->ip;
> +	unsigned long pc = rec->ip + REC_IP_BRANCH_OFFSET;
>  	bool validate = true;
> +	int ret;
>  	u32 old = 0, new;
>  	long offset = (long)pc - (long)addr;
>  
> +#ifdef CC_USING_PATCHABLE_FUNCTION_ENTRY
> +	/* -fpatchable-function-entry= does not generate a profiling call
> +	 *  initially; the NOPs are already there.
> +	 */
> +	if (addr == MCOUNT_ADDR)
> +		return 0;
> +#endif

Bogus comment style.

> +
>  	if (offset < -SZ_128M || offset >= SZ_128M) {
>  #ifdef CONFIG_ARM64_MODULE_PLTS
>  		u32 replaced;
> @@ -188,7 +249,16 @@ int ftrace_make_nop(struct module *mod,
>  
>  	new = aarch64_insn_gen_nop();
>  
> -	return ftrace_modify_code(pc, old, new, validate);
> +	ret = ftrace_modify_code(pc, old, new, validate);
> +	if (ret)
> +		return ret;
> +	if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS)) {
> +		smp_wmb(); /* ftrace call must not remain without LR saver. */

Unnecessary barrier.

Thanks,
Mark.

> +		old = *(u32*)&mov_x9_x30;
> +		ret = ftrace_modify_code(pc - REC_IP_BRANCH_OFFSET,
> +					 old, new, true);
> +	}
> +	return ret;
>  }
>  
>  void arch_ftrace_update_code(int command)
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -115,6 +115,7 @@
>  #define MCOUNT_REC()	. = ALIGN(8);				\
>  			__start_mcount_loc = .;			\
>  			KEEP(*(__mcount_loc))			\
> +			KEEP(*(__patchable_function_entries))	\
>  			__stop_mcount_loc = .;
>  #else
>  #define MCOUNT_REC()
> --- a/include/linux/compiler_types.h
> +++ b/include/linux/compiler_types.h
> @@ -61,8 +61,12 @@ extern void __chk_io_ptr(const volatile
>  #if defined(CC_USING_HOTPATCH) && !defined(__CHECKER__)
>  #define notrace __attribute__((hotpatch(0,0)))
>  #else
> +#ifdef CC_USING_PATCHABLE_FUNCTION_ENTRY
> +#define notrace __attribute__((patchable_function_entry(0)))
> +#else
>  #define notrace __attribute__((no_instrument_function))
>  #endif
> +#endif
>  
>  /* Intel compiler defines __GNUC__. So we will overwrite implementations
>   * coming from above header files here
> --- a/arch/arm64/include/asm/module.h
> +++ b/arch/arm64/include/asm/module.h
> @@ -33,6 +33,7 @@ struct mod_arch_specific {
>  
>  	/* for CONFIG_DYNAMIC_FTRACE */
>  	struct plt_entry 	*ftrace_trampoline;
> +	struct plt_entry 	*ftrace_regs_trampoline;
>  };
>  #endif
>  
> --- a/arch/arm64/kernel/module.c
> +++ b/arch/arm64/kernel/module.c
> @@ -454,6 +454,10 @@ int module_finalize(const Elf_Ehdr *hdr,
>  		if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
>  		    !strcmp(".text.ftrace_trampoline", secstrs + s->sh_name))
>  			me->arch.ftrace_trampoline = (void *)s->sh_addr;
> +		if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS) &&
> +		    !strcmp(".text.ftrace_regs_trampoline",
> +				secstrs + s->sh_name))
> +			me->arch.ftrace_regs_trampoline = (void *)s->sh_addr;
>  #endif
>  	}
>  

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 2/4] arm64: implement ftrace with regs
  2018-10-02 11:27   ` Mark Rutland
@ 2018-10-02 12:18     ` Torsten Duwe
  2018-10-02 12:57       ` Mark Rutland
  0 siblings, 1 reply; 48+ messages in thread
From: Torsten Duwe @ 2018-10-02 12:18 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Will Deacon, Catalin Marinas, Julien Thierry, Steven Rostedt,
	Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel, Arnd Bergmann,
	AKASHI Takahiro, linux-arm-kernel, linux-kernel, live-patching

Hi Mark,

thank you for your very detailed feedback, I'll incorporate it
all into the next version, besides one issue:

On Tue, Oct 02, 2018 at 12:27:41PM +0100, Mark Rutland wrote:
> 
> Please use the insn framework, as we do to generate all the other
> instruction sequences in ftrace.
> 
> MOV (register) is an alias of ORR (shifted register), i.e.
> 
> 	mov	<xd>, <xm>
> 
> ... is:
> 
> 	orr	<xd>, xzr, <xm>
> 
> ... and we have code to generate ORR, so we can add a trivial wrapper to
> generate MOV.

I had something similar in v2; but it was hardly any better to read or
understand. My main question however is: how do you justify the runtime
overhead of aarch64_insn_gen_logical_shifted_reg for every function that
gets its tracing switched on or off? The result is always the same 4-byte
constant, so why not use a macro and a comment that says what it does?

	Torsten


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 2/4] arm64: implement ftrace with regs
  2018-10-02 12:18     ` Torsten Duwe
@ 2018-10-02 12:57       ` Mark Rutland
  0 siblings, 0 replies; 48+ messages in thread
From: Mark Rutland @ 2018-10-02 12:57 UTC (permalink / raw)
  To: Torsten Duwe
  Cc: Will Deacon, Catalin Marinas, Julien Thierry, Steven Rostedt,
	Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel, Arnd Bergmann,
	AKASHI Takahiro, linux-arm-kernel, linux-kernel, live-patching

On Tue, Oct 02, 2018 at 02:18:17PM +0200, Torsten Duwe wrote:
> Hi Mark,

Hi,
 
> thank you for your very detailed feedback, I'll incorporate it
> all into the next version, besides one issue:
> 
> On Tue, Oct 02, 2018 at 12:27:41PM +0100, Mark Rutland wrote:
> > 
> > Please use the insn framework, as we do to generate all the other
> > instruction sequences in ftrace.
> > 
> > MOV (register) is an alias of ORR (shifted register), i.e.
> > 
> > 	mov	<xd>, <xm>
> > 
> > ... is:
> > 
> > 	orr	<xd>, xzr, <xm>
> > 
> > ... and we have code to generate ORR, so we can add a trivial wrapper to
> > generate MOV.
> 
> I had something similar in v2; but it was hardly any better to read or
> understand. My main question however is: how do you justify the runtime
> overhead of aarch64_insn_gen_logical_shifted_reg for every function that
> gets its tracing switched on or off?

How do you justify doing something different from the established
pattern? Do you have any numbers indicating that this overhead is a
problem on a real workload?

For the moment at least, please use aarch64_insn_gen_*(), as we do for
all other instructions generated in the ftrace code. It's vastly simpler
for everyone if we have consistency here.

> The result is always the same 4-byte constant, so why not use a macro
> and a comment that says what it does?

I'd rather that we stick to the usual pattern that we have in arm64.

Note that aarch64_insn_gen_nop() also always returns the same 4-byte
constant, but it's an out-of-line function in insn.c. There haven't been
any complaints as to its overhead so far...

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 3/4] arm64: implement live patching
  2018-10-01 14:16 ` [PATCH v3 3/4] arm64: implement live patching Torsten Duwe
@ 2018-10-17 13:39   ` Miroslav Benes
  2018-10-18 12:58     ` Jessica Yu
  2018-10-23 17:55   ` [PATCH] arm64/module: use mod->klp_info section header information Jessica Yu
  2018-11-05 17:57   ` [PATCH] arm64/module: use plt section indices for relocations Jessica Yu
  2 siblings, 1 reply; 48+ messages in thread
From: Miroslav Benes @ 2018-10-17 13:39 UTC (permalink / raw)
  To: Torsten Duwe
  Cc: Will Deacon, Catalin Marinas, Julien Thierry, Steven Rostedt,
	Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel, Arnd Bergmann,
	AKASHI Takahiro, linux-arm-kernel, linux-kernel, live-patching,
	jeyu

On Mon, 1 Oct 2018, Torsten Duwe wrote:

> Based on ftrace with regs, do the usual thing. Also allocate a
> task flag for whatever consistency handling will be used.
> Watch out for interactions with the graph tracer.

Similar to what Mark wrote about 2/4, I'd appreciate a better commit log. 
Could you explain traditional "what/why/how", please?
 
> Signed-off-by: Torsten Duwe <duwe@suse.de>
> 
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -119,6 +119,7 @@ config ARM64
>  	select HAVE_GENERIC_DMA_COHERENT
>  	select HAVE_HW_BREAKPOINT if PERF_EVENTS
>  	select HAVE_IRQ_TIME_ACCOUNTING
> +	select HAVE_LIVEPATCH
>  	select HAVE_MEMBLOCK
>  	select HAVE_MEMBLOCK_NODE_MAP if NUMA
>  	select HAVE_NMI
> @@ -1349,4 +1350,6 @@ if CRYPTO
>  source "arch/arm64/crypto/Kconfig"
>  endif
>  
> +source "kernel/livepatch/Kconfig"
> +
>  source "lib/Kconfig"
> --- a/arch/arm64/include/asm/thread_info.h
> +++ b/arch/arm64/include/asm/thread_info.h
> @@ -76,6 +76,7 @@ void arch_release_task_struct(struct tas
>  #define TIF_FOREIGN_FPSTATE	3	/* CPU's FP state is not current's */
>  #define TIF_UPROBE		4	/* uprobe breakpoint or singlestep */
>  #define TIF_FSCHECK		5	/* Check FS is USER_DS on return */
> +#define TIF_PATCH_PENDING	6
>  #define TIF_NOHZ		7
>  #define TIF_SYSCALL_TRACE	8
>  #define TIF_SYSCALL_AUDIT	9
> @@ -94,6 +95,7 @@ void arch_release_task_struct(struct tas
>  #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
>  #define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
>  #define _TIF_FOREIGN_FPSTATE	(1 << TIF_FOREIGN_FPSTATE)
> +#define _TIF_PATCH_PENDING	(1 << TIF_PATCH_PENDING)
>  #define _TIF_NOHZ		(1 << TIF_NOHZ)
>  #define _TIF_SYSCALL_TRACE	(1 << TIF_SYSCALL_TRACE)
>  #define _TIF_SYSCALL_AUDIT	(1 << TIF_SYSCALL_AUDIT)
> @@ -106,7 +108,8 @@ void arch_release_task_struct(struct tas
>  
>  #define _TIF_WORK_MASK		(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
>  				 _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
> -				 _TIF_UPROBE | _TIF_FSCHECK)
> +				 _TIF_UPROBE | _TIF_FSCHECK | \
> +				 _TIF_PATCH_PENDING)

Could you add a note to the changelog what this means? My ability to read 
arm64 entry.S is very limited, but I can see that _TIF_WORK_MASK is 
process in a syscall exit and irq return paths. That's good. It is also 
called (via "b ret_to_user") in a couple of different places (el0_* 
labels). I guess those are returns from exception handling. A comment 
about it in the changelog would be appreciated.
  
>  #define _TIF_SYSCALL_WORK	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
>  				 _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
> --- /dev/null
> +++ b/arch/arm64/include/asm/livepatch.h
> @@ -0,0 +1,36 @@
> +/* SPDX-License-Identifier: GPL-2.0
> + *
> + * livepatch.h - arm64-specific Kernel Live Patching Core
> + *
> + * Copyright (C) 2016,2018 SUSE
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version 2
> + * of the License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +#ifndef _ASM_ARM64_LIVEPATCH_H
> +#define _ASM_ARM64_LIVEPATCH_H
> +
> +#include <linux/module.h>
> +#include <linux/ftrace.h>

I think that only

#include <asm/ptrace.h>

is in fact needed, because of "struct pt_regs".


Ad relocations. I checked that everything in struct mod_arch_specific 
stays after the module is load. Both core and init get SHF_ALLOC set 
(mod->arch.core.plt->sh_flags in module_frob_arch_sections(). It is 
important because apply_relocate_add() may use those sections 
through module_emit_plt_entry() call.

ftrace_trampoline section gets SHF_ALLOC as well. Btw, don't you want to 
do something similar for new ftrace_regs_trampoline in 
module_frob_arch_sections()? Perhaps even edit 
arch/arm64/kernel/module.lds? I'm completely clueless here, so it may be 
ok. And it applies to 2/4 patch.

The last thing is count_plts() function called from 
module_frob_arch_sections(). It needed to be changed in 2016 as well. See 
Jessica's patch 
(20160713001113.GA30925@packer-debian-8-amd64.digitalocean.com). The logic 
in the function has changed since then. If I am not mistaken, 
count_plts() is fine as it is right now. It does not consider SHN_UNDEF 
anymore, it looks at the destination section (where a symbol should 
resolved to) only.

Jessica, could you doublecheck, please?

Torsten, please doublecheck all this as well and add a comment about all 
this to the changelog, so we don't have to analyze it from the beginning 
again in the future.

Thanks,
Miroslav

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 3/4] arm64: implement live patching
  2018-10-17 13:39   ` Miroslav Benes
@ 2018-10-18 12:58     ` Jessica Yu
  2018-10-19 11:59       ` Miroslav Benes
  2018-10-19 13:52       ` Ard Biesheuvel
  0 siblings, 2 replies; 48+ messages in thread
From: Jessica Yu @ 2018-10-18 12:58 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Torsten Duwe, Will Deacon, Catalin Marinas, Julien Thierry,
	Steven Rostedt, Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel,
	Arnd Bergmann, AKASHI Takahiro, linux-arm-kernel, linux-kernel,
	live-patching

+++ Miroslav Benes [17/10/18 15:39 +0200]:
>On Mon, 1 Oct 2018, Torsten Duwe wrote:
>
>> Based on ftrace with regs, do the usual thing. Also allocate a
>> task flag for whatever consistency handling will be used.
>> Watch out for interactions with the graph tracer.
>
>Similar to what Mark wrote about 2/4, I'd appreciate a better commit log.
>Could you explain traditional "what/why/how", please?
>
>> Signed-off-by: Torsten Duwe <duwe@suse.de>
>>
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -119,6 +119,7 @@ config ARM64
>>  	select HAVE_GENERIC_DMA_COHERENT
>>  	select HAVE_HW_BREAKPOINT if PERF_EVENTS
>>  	select HAVE_IRQ_TIME_ACCOUNTING
>> +	select HAVE_LIVEPATCH
>>  	select HAVE_MEMBLOCK
>>  	select HAVE_MEMBLOCK_NODE_MAP if NUMA
>>  	select HAVE_NMI
>> @@ -1349,4 +1350,6 @@ if CRYPTO
>>  source "arch/arm64/crypto/Kconfig"
>>  endif
>>
>> +source "kernel/livepatch/Kconfig"
>> +
>>  source "lib/Kconfig"
>> --- a/arch/arm64/include/asm/thread_info.h
>> +++ b/arch/arm64/include/asm/thread_info.h
>> @@ -76,6 +76,7 @@ void arch_release_task_struct(struct tas
>>  #define TIF_FOREIGN_FPSTATE	3	/* CPU's FP state is not current's */
>>  #define TIF_UPROBE		4	/* uprobe breakpoint or singlestep */
>>  #define TIF_FSCHECK		5	/* Check FS is USER_DS on return */
>> +#define TIF_PATCH_PENDING	6
>>  #define TIF_NOHZ		7
>>  #define TIF_SYSCALL_TRACE	8
>>  #define TIF_SYSCALL_AUDIT	9
>> @@ -94,6 +95,7 @@ void arch_release_task_struct(struct tas
>>  #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
>>  #define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
>>  #define _TIF_FOREIGN_FPSTATE	(1 << TIF_FOREIGN_FPSTATE)
>> +#define _TIF_PATCH_PENDING	(1 << TIF_PATCH_PENDING)
>>  #define _TIF_NOHZ		(1 << TIF_NOHZ)
>>  #define _TIF_SYSCALL_TRACE	(1 << TIF_SYSCALL_TRACE)
>>  #define _TIF_SYSCALL_AUDIT	(1 << TIF_SYSCALL_AUDIT)
>> @@ -106,7 +108,8 @@ void arch_release_task_struct(struct tas
>>
>>  #define _TIF_WORK_MASK		(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
>>  				 _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
>> -				 _TIF_UPROBE | _TIF_FSCHECK)
>> +				 _TIF_UPROBE | _TIF_FSCHECK | \
>> +				 _TIF_PATCH_PENDING)
>
>Could you add a note to the changelog what this means? My ability to read
>arm64 entry.S is very limited, but I can see that _TIF_WORK_MASK is
>process in a syscall exit and irq return paths. That's good. It is also
>called (via "b ret_to_user") in a couple of different places (el0_*
>labels). I guess those are returns from exception handling. A comment
>about it in the changelog would be appreciated.
>
>>  #define _TIF_SYSCALL_WORK	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
>>  				 _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
>> --- /dev/null
>> +++ b/arch/arm64/include/asm/livepatch.h
>> @@ -0,0 +1,36 @@
>> +/* SPDX-License-Identifier: GPL-2.0
>> + *
>> + * livepatch.h - arm64-specific Kernel Live Patching Core
>> + *
>> + * Copyright (C) 2016,2018 SUSE
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU General Public License
>> + * as published by the Free Software Foundation; either version 2
>> + * of the License, or (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; if not, see <http://www.gnu.org/licenses/>.
>> + */
>> +#ifndef _ASM_ARM64_LIVEPATCH_H
>> +#define _ASM_ARM64_LIVEPATCH_H
>> +
>> +#include <linux/module.h>
>> +#include <linux/ftrace.h>
>
>I think that only
>
>#include <asm/ptrace.h>
>
>is in fact needed, because of "struct pt_regs".
>
>
>Ad relocations. I checked that everything in struct mod_arch_specific
>stays after the module is load. Both core and init get SHF_ALLOC set
>(mod->arch.core.plt->sh_flags in module_frob_arch_sections(). It is
>important because apply_relocate_add() may use those sections
>through module_emit_plt_entry() call.

Yes, it looks like the needed .plt sections will remain in module
memory. However, I think I found a slight issue... :/

In module_frob_arch_sections(), mod->arch.core.plt is set to an offset
within info->sechdrs:

             if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
                     mod->arch.core.plt = sechdrs + i;

sechdrs is from info->sechdrs, which is freed at the end of
load_module() via free_copy(). So although the relevant plt section(s)
are in module memory, mod->arch.core.plt will point to invalid memory
after info is freed. In other words, the section (.plt) *is* in memory
but the section header (struct elf64_shdr) containing section metadata
like sh_addr, sh_size etc., is not.

But we have mod->klp_info->sechdrs (which is a copy of the section
headers) for this very reason. It is initialized only at the very end
of load_module(). I don't think copy_module_elf() is dependent on
anything, so it can just be moved earlier.

Note that we cannot set mod->arch.core.plt to mod->klp_info->sechdrs + i
during module_frob_arch_sections() because it is still too early, none
of the module sections have been copied and none of their sh_addr's
have been set to their final locations as this is all happening before
move_module() is called. So we can use a module_finalize() function
for arm64 to switch the mod->arch.core.plt to use the klp_info section
headers.

Maybe this will be more clear in the example fix below (completely
untested and unreviewed!):

diff --git a/arch/arm64/include/asm/module.h b/arch/arm64/include/asm/module.h
index 97d0ef12e2ff..150afc29171b 100644
--- a/arch/arm64/include/asm/module.h
+++ b/arch/arm64/include/asm/module.h
@@ -25,6 +25,7 @@ struct mod_plt_sec {
 	struct elf64_shdr	*plt;
 	int			plt_num_entries;
 	int			plt_max_entries;
+	int			plt_shndx;
 };
 
 struct mod_arch_specific {
diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c
index f0690c2ca3e0..c23cef8f0165 100644
--- a/arch/arm64/kernel/module-plts.c
+++ b/arch/arm64/kernel/module-plts.c
@@ -196,6 +196,21 @@ static unsigned int count_plts(Elf64_Sym *syms, Elf64_Rela *rela, int num,
 	return ret;
 }
 
+int module_finalize(const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs,
+		    struct module *mod)
+{
+	/*
+	 * Livepatch modules need to keep access to section headers to apply
+	 * relocations. Note mod->klp_info should have already been initialized
+	 * and all section sh_addr's should have their final addresses by the
+	 * time module_finalize() is called.
+	 */
+	if (is_livepatch_module(mod))
+		mod->arch.core.plt = mod->klp_info->sechdrs + mod->arch.core.plt_shndx;
+
+	return 0;
+}
+
 int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 			      char *secstrings, struct module *mod)
 {
@@ -210,11 +225,13 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 	 * entries. Record the symtab address as well.
 	 */
 	for (i = 0; i < ehdr->e_shnum; i++) {
-		if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
+		if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt")) {
 			mod->arch.core.plt = sechdrs + i;
-		else if (!strcmp(secstrings + sechdrs[i].sh_name, ".init.plt"))
+			mod->arch.core.plt_shndx = i;
+		} else if (!strcmp(secstrings + sechdrs[i].sh_name, ".init.plt")) {
 			mod->arch.init.plt = sechdrs + i;
-		else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
+			mod->arch.init.plt_shndx = i;
+		} else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
 			 !strcmp(secstrings + sechdrs[i].sh_name,
 				 ".text.ftrace_trampoline"))
 			tramp = sechdrs + i;
diff --git a/kernel/module.c b/kernel/module.c
index 6746c85511fe..2fc4d74288dd 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3363,6 +3363,12 @@ static int post_relocation(struct module *mod, const struct load_info *info)
 	/* Setup kallsyms-specific fields. */
 	add_kallsyms(mod, info);
 
+	if (is_livepatch_module(mod)) {
+		err = copy_module_elf(mod, info);
+		if (err < 0)
+			return err;
+	}
+
 	/* Arch-specific module finalizing. */
 	return module_finalize(info->hdr, info->sechdrs, mod);
 }
@@ -3775,12 +3781,6 @@ static int load_module(struct load_info *info, const char __user *uargs,
 	if (err < 0)
 		goto coming_cleanup;
 
-	if (is_livepatch_module(mod)) {
-		err = copy_module_elf(mod, info);
-		if (err < 0)
-			goto sysfs_cleanup;
-	}
-
 	/* Get rid of temporary copy. */
 	free_copy(info);

Thoughts? Does the fix make sense?

>The last thing is count_plts() function called from
>module_frob_arch_sections(). It needed to be changed in 2016 as well. See
>Jessica's patch
>(20160713001113.GA30925@packer-debian-8-amd64.digitalocean.com). The logic
>in the function has changed since then. If I am not mistaken,
>count_plts() is fine as it is right now. It does not consider SHN_UNDEF
>anymore, it looks at the destination section (where a symbol should
>resolved to) only.
>
>Jessica, could you doublecheck, please?

Yes, I think count_plts() is fine now in this case (because it is
always true that sym->st_shndx != dstidx for SHN_LIVEPATCH symbols)
and my old patch from 2016 is no longer needed.

Thanks,

Jessica

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 3/4] arm64: implement live patching
  2018-10-18 12:58     ` Jessica Yu
@ 2018-10-19 11:59       ` Miroslav Benes
  2018-10-19 12:18         ` Jessica Yu
  2018-10-19 13:46         ` Torsten Duwe
  2018-10-19 13:52       ` Ard Biesheuvel
  1 sibling, 2 replies; 48+ messages in thread
From: Miroslav Benes @ 2018-10-19 11:59 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Torsten Duwe, Will Deacon, Catalin Marinas, Julien Thierry,
	Steven Rostedt, Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel,
	Arnd Bergmann, AKASHI Takahiro, linux-arm-kernel, linux-kernel,
	live-patching

On Thu, 18 Oct 2018, Jessica Yu wrote:

> +++ Miroslav Benes [17/10/18 15:39 +0200]:
> >On Mon, 1 Oct 2018, Torsten Duwe wrote:
> >
> >Ad relocations. I checked that everything in struct mod_arch_specific
> >stays after the module is load. Both core and init get SHF_ALLOC set
> >(mod->arch.core.plt->sh_flags in module_frob_arch_sections(). It is
> >important because apply_relocate_add() may use those sections
> >through module_emit_plt_entry() call.
> 
> Yes, it looks like the needed .plt sections will remain in module
> memory. However, I think I found a slight issue... :/
> 
> In module_frob_arch_sections(), mod->arch.core.plt is set to an offset
> within info->sechdrs:
> 
>             if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
>                     mod->arch.core.plt = sechdrs + i;
> 
> sechdrs is from info->sechdrs, which is freed at the end of
> load_module() via free_copy(). So although the relevant plt section(s)
> are in module memory, mod->arch.core.plt will point to invalid memory
> after info is freed. In other words, the section (.plt) *is* in memory
> but the section header (struct elf64_shdr) containing section metadata
> like sh_addr, sh_size etc., is not.
> 
> But we have mod->klp_info->sechdrs (which is a copy of the section
> headers) for this very reason. It is initialized only at the very end
> of load_module(). I don't think copy_module_elf() is dependent on
> anything, so it can just be moved earlier.
> 
> Note that we cannot set mod->arch.core.plt to mod->klp_info->sechdrs + i
> during module_frob_arch_sections() because it is still too early, none
> of the module sections have been copied and none of their sh_addr's
> have been set to their final locations as this is all happening before
> move_module() is called. So we can use a module_finalize() function
> for arm64 to switch the mod->arch.core.plt to use the klp_info section
> headers.

Thanks for the analysis. You seem to be right.
 
> Maybe this will be more clear in the example fix below (completely
> untested and unreviewed!):
> 
> diff --git a/arch/arm64/include/asm/module.h b/arch/arm64/include/asm/module.h
> index 97d0ef12e2ff..150afc29171b 100644
> --- a/arch/arm64/include/asm/module.h
> +++ b/arch/arm64/include/asm/module.h
> @@ -25,6 +25,7 @@ struct mod_plt_sec {
> 	struct elf64_shdr	*plt;
> 	int			plt_num_entries;
> 	int			plt_max_entries;
> +	int			plt_shndx;
> };
> 
> struct mod_arch_specific {
> diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c
> index f0690c2ca3e0..c23cef8f0165 100644
> --- a/arch/arm64/kernel/module-plts.c
> +++ b/arch/arm64/kernel/module-plts.c
> @@ -196,6 +196,21 @@ static unsigned int count_plts(Elf64_Sym *syms,
> Elf64_Rela *rela, int num,
> 	return ret;
> }
> 
> +int module_finalize(const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs,
> +		    struct module *mod)
> +{
> +	/*
> +	 * Livepatch modules need to keep access to section headers to apply
> +	 * relocations. Note mod->klp_info should have already been
> initialized
> +	 * and all section sh_addr's should have their final addresses by the
> +	 * time module_finalize() is called.
> +	 */
> +	if (is_livepatch_module(mod))
> +		mod->arch.core.plt = mod->klp_info->sechdrs +
> mod->arch.core.plt_shndx;
> +
> +	return 0;
> +}

There is already module_finalize() in arch/arm64/kernel/module.c, so this 
should probably go there.

If I am not mistaken, we do not care for arch.init.plt in livepatch. Is 
that correct?

> int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
> 			      char *secstrings, struct module *mod)
> {
> @@ -210,11 +225,13 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr
> *sechdrs,
> 	 * entries. Record the symtab address as well.
> 	 */
> 	for (i = 0; i < ehdr->e_shnum; i++) {
> -		if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
> +		if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt")) {
> 			mod->arch.core.plt = sechdrs + i;
> -		else if (!strcmp(secstrings + sechdrs[i].sh_name,
> ".init.plt"))
> +			mod->arch.core.plt_shndx = i;
> +		} else if (!strcmp(secstrings + sechdrs[i].sh_name,
> ".init.plt")) {
> 			mod->arch.init.plt = sechdrs + i;
> -		else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
> +			mod->arch.init.plt_shndx = i;

It is initialized here, but that's not necessarily a bad thing.

> +		} else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
> 			 !strcmp(secstrings + sechdrs[i].sh_name,
> 				 ".text.ftrace_trampoline"))
> 			tramp = sechdrs + i;
> diff --git a/kernel/module.c b/kernel/module.c
> index 6746c85511fe..2fc4d74288dd 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -3363,6 +3363,12 @@ static int post_relocation(struct module *mod, const
> struct load_info *info)
> 	/* Setup kallsyms-specific fields. */
> 	add_kallsyms(mod, info);
> 
> +	if (is_livepatch_module(mod)) {
> +		err = copy_module_elf(mod, info);
> +		if (err < 0)
> +			return err;
> +	}
> +
> 	/* Arch-specific module finalizing. */
> 	return module_finalize(info->hdr, info->sechdrs, mod);
> }
> @@ -3775,12 +3781,6 @@ static int load_module(struct load_info *info, const
> char __user *uargs,
> 	if (err < 0)
> 		goto coming_cleanup;
> 
> -	if (is_livepatch_module(mod)) {
> -		err = copy_module_elf(mod, info);
> -		if (err < 0)
> -			goto sysfs_cleanup;
> -	}
> -
> 	/* Get rid of temporary copy. */
> 	free_copy(info);

Hm, this is hopefully all right. We'd need to change the error handling a 
bit. free_module_elf() is now called in free_module() and it should be 
moved somewhere to load_module(). Probably under free_arch_cleanup label. 
Although it would be much better to rework post_relocation() and handle it 
right there.

Something like

-	return module_finalize(info->hdr, info->sechdrs, mod);
+	err = module_finalize(info->hdr, info->sechdrs, mod);
+	if (err < 0)
+		free_module_elf();
+
+	return err;

> Thoughts? Does the fix make sense?

I think so.

> >The last thing is count_plts() function called from
> >module_frob_arch_sections(). It needed to be changed in 2016 as well. See
> >Jessica's patch
> >(20160713001113.GA30925@packer-debian-8-amd64.digitalocean.com). The logic
> >in the function has changed since then. If I am not mistaken,
> >count_plts() is fine as it is right now. It does not consider SHN_UNDEF
> >anymore, it looks at the destination section (where a symbol should
> >resolved to) only.
> >
> >Jessica, could you doublecheck, please?
> 
> Yes, I think count_plts() is fine now in this case (because it is
> always true that sym->st_shndx != dstidx for SHN_LIVEPATCH symbols)
> and my old patch from 2016 is no longer needed.

Thanks a lot, Jessica.

Torsten, could you include the outcome to your patch set once we settle on 
it? Thanks.

Miroslav

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 3/4] arm64: implement live patching
  2018-10-19 11:59       ` Miroslav Benes
@ 2018-10-19 12:18         ` Jessica Yu
  2018-10-19 15:14           ` Miroslav Benes
  2018-10-19 13:46         ` Torsten Duwe
  1 sibling, 1 reply; 48+ messages in thread
From: Jessica Yu @ 2018-10-19 12:18 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Torsten Duwe, Will Deacon, Catalin Marinas, Julien Thierry,
	Steven Rostedt, Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel,
	Arnd Bergmann, AKASHI Takahiro, linux-arm-kernel, linux-kernel,
	live-patching

+++ Miroslav Benes [19/10/18 13:59 +0200]:
>On Thu, 18 Oct 2018, Jessica Yu wrote:
>
>> +++ Miroslav Benes [17/10/18 15:39 +0200]:
>> >On Mon, 1 Oct 2018, Torsten Duwe wrote:
>> >
>> >Ad relocations. I checked that everything in struct mod_arch_specific
>> >stays after the module is load. Both core and init get SHF_ALLOC set
>> >(mod->arch.core.plt->sh_flags in module_frob_arch_sections(). It is
>> >important because apply_relocate_add() may use those sections
>> >through module_emit_plt_entry() call.
>>
>> Yes, it looks like the needed .plt sections will remain in module
>> memory. However, I think I found a slight issue... :/
>>
>> In module_frob_arch_sections(), mod->arch.core.plt is set to an offset
>> within info->sechdrs:
>>
>>             if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
>>                     mod->arch.core.plt = sechdrs + i;
>>
>> sechdrs is from info->sechdrs, which is freed at the end of
>> load_module() via free_copy(). So although the relevant plt section(s)
>> are in module memory, mod->arch.core.plt will point to invalid memory
>> after info is freed. In other words, the section (.plt) *is* in memory
>> but the section header (struct elf64_shdr) containing section metadata
>> like sh_addr, sh_size etc., is not.
>>
>> But we have mod->klp_info->sechdrs (which is a copy of the section
>> headers) for this very reason. It is initialized only at the very end
>> of load_module(). I don't think copy_module_elf() is dependent on
>> anything, so it can just be moved earlier.
>>
>> Note that we cannot set mod->arch.core.plt to mod->klp_info->sechdrs + i
>> during module_frob_arch_sections() because it is still too early, none
>> of the module sections have been copied and none of their sh_addr's
>> have been set to their final locations as this is all happening before
>> move_module() is called. So we can use a module_finalize() function
>> for arm64 to switch the mod->arch.core.plt to use the klp_info section
>> headers.
>
>Thanks for the analysis. You seem to be right.
>
>> Maybe this will be more clear in the example fix below (completely
>> untested and unreviewed!):
>>
>> diff --git a/arch/arm64/include/asm/module.h b/arch/arm64/include/asm/module.h
>> index 97d0ef12e2ff..150afc29171b 100644
>> --- a/arch/arm64/include/asm/module.h
>> +++ b/arch/arm64/include/asm/module.h
>> @@ -25,6 +25,7 @@ struct mod_plt_sec {
>> 	struct elf64_shdr	*plt;
>> 	int			plt_num_entries;
>> 	int			plt_max_entries;
>> +	int			plt_shndx;
>> };
>>
>> struct mod_arch_specific {
>> diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c
>> index f0690c2ca3e0..c23cef8f0165 100644
>> --- a/arch/arm64/kernel/module-plts.c
>> +++ b/arch/arm64/kernel/module-plts.c
>> @@ -196,6 +196,21 @@ static unsigned int count_plts(Elf64_Sym *syms,
>> Elf64_Rela *rela, int num,
>> 	return ret;
>> }
>>
>> +int module_finalize(const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs,
>> +		    struct module *mod)
>> +{
>> +	/*
>> +	 * Livepatch modules need to keep access to section headers to apply
>> +	 * relocations. Note mod->klp_info should have already been
>> initialized
>> +	 * and all section sh_addr's should have their final addresses by the
>> +	 * time module_finalize() is called.
>> +	 */
>> +	if (is_livepatch_module(mod))
>> +		mod->arch.core.plt = mod->klp_info->sechdrs +
>> mod->arch.core.plt_shndx;
>> +
>> +	return 0;
>> +}
>
>There is already module_finalize() in arch/arm64/kernel/module.c, so this
>should probably go there.

Ah yeah, I missed that :) Yep, that would be where it'd go.

>If I am not mistaken, we do not care for arch.init.plt in livepatch. Is
>that correct?

I do not believe patching of __init functions is supported (right?) So
we do not need to keep arch.init.plt alive post-module-load.

>> int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
>> 			      char *secstrings, struct module *mod)
>> {
>> @@ -210,11 +225,13 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr
>> *sechdrs,
>> 	 * entries. Record the symtab address as well.
>> 	 */
>> 	for (i = 0; i < ehdr->e_shnum; i++) {
>> -		if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
>> +		if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt")) {
>> 			mod->arch.core.plt = sechdrs + i;
>> -		else if (!strcmp(secstrings + sechdrs[i].sh_name,
>> ".init.plt"))
>> +			mod->arch.core.plt_shndx = i;
>> +		} else if (!strcmp(secstrings + sechdrs[i].sh_name,
>> ".init.plt")) {
>> 			mod->arch.init.plt = sechdrs + i;
>> -		else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
>> +			mod->arch.init.plt_shndx = i;
>
>It is initialized here, but that's not necessarily a bad thing.

I think I added this line for consistency, but I actually don't think
it is needed. We only would need to keep the section index for
arch.core.plt then.

>> +		} else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
>> 			 !strcmp(secstrings + sechdrs[i].sh_name,
>> 				 ".text.ftrace_trampoline"))
>> 			tramp = sechdrs + i;
>> diff --git a/kernel/module.c b/kernel/module.c
>> index 6746c85511fe..2fc4d74288dd 100644
>> --- a/kernel/module.c
>> +++ b/kernel/module.c
>> @@ -3363,6 +3363,12 @@ static int post_relocation(struct module *mod, const
>> struct load_info *info)
>> 	/* Setup kallsyms-specific fields. */
>> 	add_kallsyms(mod, info);
>>
>> +	if (is_livepatch_module(mod)) {
>> +		err = copy_module_elf(mod, info);
>> +		if (err < 0)
>> +			return err;
>> +	}
>> +
>> 	/* Arch-specific module finalizing. */
>> 	return module_finalize(info->hdr, info->sechdrs, mod);
>> }
>> @@ -3775,12 +3781,6 @@ static int load_module(struct load_info *info, const
>> char __user *uargs,
>> 	if (err < 0)
>> 		goto coming_cleanup;
>>
>> -	if (is_livepatch_module(mod)) {
>> -		err = copy_module_elf(mod, info);
>> -		if (err < 0)
>> -			goto sysfs_cleanup;
>> -	}
>> -
>> 	/* Get rid of temporary copy. */
>> 	free_copy(info);
>
>Hm, this is hopefully all right. We'd need to change the error handling a
>bit. free_module_elf() is now called in free_module() and it should be
>moved somewhere to load_module(). Probably under free_arch_cleanup label.
>Although it would be much better to rework post_relocation() and handle it
>right there.
>
>Something like
>
>-	return module_finalize(info->hdr, info->sechdrs, mod);
>+	err = module_finalize(info->hdr, info->sechdrs, mod);
>+	if (err < 0)
>+		free_module_elf();
>+
>+	return err;

Ah yeah, I'll clean up the error handling. I'll re-submit this as a
real patch after cleaning it up a bit and it could be added to this
series after review.

>> Thoughts? Does the fix make sense?
>
>I think so.

Thanks Miroslav for the impromptu review :-)

Jessica

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 3/4] arm64: implement live patching
  2018-10-19 11:59       ` Miroslav Benes
  2018-10-19 12:18         ` Jessica Yu
@ 2018-10-19 13:46         ` Torsten Duwe
  1 sibling, 0 replies; 48+ messages in thread
From: Torsten Duwe @ 2018-10-19 13:46 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Jessica Yu, Will Deacon, Catalin Marinas, Julien Thierry,
	Steven Rostedt, Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel,
	Arnd Bergmann, AKASHI Takahiro, linux-arm-kernel, linux-kernel,
	live-patching

On Fri, Oct 19, 2018 at 01:59:01PM +0200, Miroslav Benes wrote:
> 
> Torsten, could you include the outcome to your patch set once we settle on 
> it? Thanks.

Absolutely! Whether as patch 4/4 or on its own and I refer to it -- we'll
figure it out.

	Torsten


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 3/4] arm64: implement live patching
  2018-10-18 12:58     ` Jessica Yu
  2018-10-19 11:59       ` Miroslav Benes
@ 2018-10-19 13:52       ` Ard Biesheuvel
  2018-10-19 15:21         ` Miroslav Benes
  1 sibling, 1 reply; 48+ messages in thread
From: Ard Biesheuvel @ 2018-10-19 13:52 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Miroslav Benes, Torsten Duwe, Will Deacon, Catalin Marinas,
	Julien Thierry, Steven Rostedt, Josh Poimboeuf, Ingo Molnar,
	Arnd Bergmann, AKASHI Takahiro, linux-arm-kernel,
	Linux Kernel Mailing List, live-patching

On 18 October 2018 at 20:58, Jessica Yu <jeyu@kernel.org> wrote:
> +++ Miroslav Benes [17/10/18 15:39 +0200]:
>
>> On Mon, 1 Oct 2018, Torsten Duwe wrote:
>>
>>> Based on ftrace with regs, do the usual thing. Also allocate a
>>> task flag for whatever consistency handling will be used.
>>> Watch out for interactions with the graph tracer.
>>
>>
>> Similar to what Mark wrote about 2/4, I'd appreciate a better commit log.
>> Could you explain traditional "what/why/how", please?
>>
>>> Signed-off-by: Torsten Duwe <duwe@suse.de>
>>>
>>> --- a/arch/arm64/Kconfig
>>> +++ b/arch/arm64/Kconfig
>>> @@ -119,6 +119,7 @@ config ARM64
>>>         select HAVE_GENERIC_DMA_COHERENT
>>>         select HAVE_HW_BREAKPOINT if PERF_EVENTS
>>>         select HAVE_IRQ_TIME_ACCOUNTING
>>> +       select HAVE_LIVEPATCH
>>>         select HAVE_MEMBLOCK
>>>         select HAVE_MEMBLOCK_NODE_MAP if NUMA
>>>         select HAVE_NMI
>>> @@ -1349,4 +1350,6 @@ if CRYPTO
>>>  source "arch/arm64/crypto/Kconfig"
>>>  endif
>>>
>>> +source "kernel/livepatch/Kconfig"
>>> +
>>>  source "lib/Kconfig"
>>> --- a/arch/arm64/include/asm/thread_info.h
>>> +++ b/arch/arm64/include/asm/thread_info.h
>>> @@ -76,6 +76,7 @@ void arch_release_task_struct(struct tas
>>>  #define TIF_FOREIGN_FPSTATE    3       /* CPU's FP state is not
>>> current's */
>>>  #define TIF_UPROBE             4       /* uprobe breakpoint or
>>> singlestep */
>>>  #define TIF_FSCHECK            5       /* Check FS is USER_DS on return
>>> */
>>> +#define TIF_PATCH_PENDING      6
>>>  #define TIF_NOHZ               7
>>>  #define TIF_SYSCALL_TRACE      8
>>>  #define TIF_SYSCALL_AUDIT      9
>>> @@ -94,6 +95,7 @@ void arch_release_task_struct(struct tas
>>>  #define _TIF_NEED_RESCHED      (1 << TIF_NEED_RESCHED)
>>>  #define _TIF_NOTIFY_RESUME     (1 << TIF_NOTIFY_RESUME)
>>>  #define _TIF_FOREIGN_FPSTATE   (1 << TIF_FOREIGN_FPSTATE)
>>> +#define _TIF_PATCH_PENDING     (1 << TIF_PATCH_PENDING)
>>>  #define _TIF_NOHZ              (1 << TIF_NOHZ)
>>>  #define _TIF_SYSCALL_TRACE     (1 << TIF_SYSCALL_TRACE)
>>>  #define _TIF_SYSCALL_AUDIT     (1 << TIF_SYSCALL_AUDIT)
>>> @@ -106,7 +108,8 @@ void arch_release_task_struct(struct tas
>>>
>>>  #define _TIF_WORK_MASK         (_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
>>>                                  _TIF_NOTIFY_RESUME |
>>> _TIF_FOREIGN_FPSTATE | \
>>> -                                _TIF_UPROBE | _TIF_FSCHECK)
>>> +                                _TIF_UPROBE | _TIF_FSCHECK | \
>>> +                                _TIF_PATCH_PENDING)
>>
>>
>> Could you add a note to the changelog what this means? My ability to read
>> arm64 entry.S is very limited, but I can see that _TIF_WORK_MASK is
>> process in a syscall exit and irq return paths. That's good. It is also
>> called (via "b ret_to_user") in a couple of different places (el0_*
>> labels). I guess those are returns from exception handling. A comment
>> about it in the changelog would be appreciated.
>>
>>>  #define _TIF_SYSCALL_WORK      (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT
>>> | \
>>>                                  _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP |
>>> \
>>> --- /dev/null
>>> +++ b/arch/arm64/include/asm/livepatch.h
>>> @@ -0,0 +1,36 @@
>>> +/* SPDX-License-Identifier: GPL-2.0
>>> + *
>>> + * livepatch.h - arm64-specific Kernel Live Patching Core
>>> + *
>>> + * Copyright (C) 2016,2018 SUSE
>>> + *
>>> + * This program is free software; you can redistribute it and/or
>>> + * modify it under the terms of the GNU General Public License
>>> + * as published by the Free Software Foundation; either version 2
>>> + * of the License, or (at your option) any later version.
>>> + *
>>> + * This program is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>> + * GNU General Public License for more details.
>>> + *
>>> + * You should have received a copy of the GNU General Public License
>>> + * along with this program; if not, see <http://www.gnu.org/licenses/>.
>>> + */
>>> +#ifndef _ASM_ARM64_LIVEPATCH_H
>>> +#define _ASM_ARM64_LIVEPATCH_H
>>> +
>>> +#include <linux/module.h>
>>> +#include <linux/ftrace.h>
>>
>>
>> I think that only
>>
>> #include <asm/ptrace.h>
>>
>> is in fact needed, because of "struct pt_regs".
>>
>>
>> Ad relocations. I checked that everything in struct mod_arch_specific
>> stays after the module is load. Both core and init get SHF_ALLOC set
>> (mod->arch.core.plt->sh_flags in module_frob_arch_sections(). It is
>> important because apply_relocate_add() may use those sections
>> through module_emit_plt_entry() call.
>
>
> Yes, it looks like the needed .plt sections will remain in module
> memory. However, I think I found a slight issue... :/
>
> In module_frob_arch_sections(), mod->arch.core.plt is set to an offset
> within info->sechdrs:
>
>             if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
>                     mod->arch.core.plt = sechdrs + i;
>
> sechdrs is from info->sechdrs, which is freed at the end of
> load_module() via free_copy(). So although the relevant plt section(s)
> are in module memory, mod->arch.core.plt will point to invalid memory
> after info is freed. In other words, the section (.plt) *is* in memory
> but the section header (struct elf64_shdr) containing section metadata
> like sh_addr, sh_size etc., is not.
>

Just for my understanding: this only matters for live patching, right?

The original PLT support was implemented to support loading modules
outside of the -/+ 128 MB range of an arm64 relative branch/jump
instruction, and was later enhanced [for the same reason] to support
emitting a trampoline for the ftrace entrypoint.



> But we have mod->klp_info->sechdrs (which is a copy of the section
> headers) for this very reason. It is initialized only at the very end
> of load_module(). I don't think copy_module_elf() is dependent on
> anything, so it can just be moved earlier.
>
> Note that we cannot set mod->arch.core.plt to mod->klp_info->sechdrs + i
> during module_frob_arch_sections() because it is still too early, none
> of the module sections have been copied and none of their sh_addr's
> have been set to their final locations as this is all happening before
> move_module() is called. So we can use a module_finalize() function
> for arm64 to switch the mod->arch.core.plt to use the klp_info section
> headers.
>
> Maybe this will be more clear in the example fix below (completely
> untested and unreviewed!):
>
> diff --git a/arch/arm64/include/asm/module.h
> b/arch/arm64/include/asm/module.h
> index 97d0ef12e2ff..150afc29171b 100644
> --- a/arch/arm64/include/asm/module.h
> +++ b/arch/arm64/include/asm/module.h
> @@ -25,6 +25,7 @@ struct mod_plt_sec {
>         struct elf64_shdr       *plt;
>         int                     plt_num_entries;
>         int                     plt_max_entries;
> +       int                     plt_shndx;
> };
>
> struct mod_arch_specific {
> diff --git a/arch/arm64/kernel/module-plts.c
> b/arch/arm64/kernel/module-plts.c
> index f0690c2ca3e0..c23cef8f0165 100644
> --- a/arch/arm64/kernel/module-plts.c
> +++ b/arch/arm64/kernel/module-plts.c
> @@ -196,6 +196,21 @@ static unsigned int count_plts(Elf64_Sym *syms,
> Elf64_Rela *rela, int num,
>         return ret;
> }
>
> +int module_finalize(const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs,
> +                   struct module *mod)
> +{
> +       /*
> +        * Livepatch modules need to keep access to section headers to apply
> +        * relocations. Note mod->klp_info should have already been
> initialized
> +        * and all section sh_addr's should have their final addresses by
> the
> +        * time module_finalize() is called.
> +        */
> +       if (is_livepatch_module(mod))
> +               mod->arch.core.plt = mod->klp_info->sechdrs +
> mod->arch.core.plt_shndx;
> +
> +       return 0;
> +}
> +
> int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
>                               char *secstrings, struct module *mod)
> {
> @@ -210,11 +225,13 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr
> *sechdrs,
>          * entries. Record the symtab address as well.
>          */
>         for (i = 0; i < ehdr->e_shnum; i++) {
> -               if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
> +               if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt")) {
>                         mod->arch.core.plt = sechdrs + i;
> -               else if (!strcmp(secstrings + sechdrs[i].sh_name,
> ".init.plt"))
> +                       mod->arch.core.plt_shndx = i;
> +               } else if (!strcmp(secstrings + sechdrs[i].sh_name,
> ".init.plt")) {
>                         mod->arch.init.plt = sechdrs + i;
> -               else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
> +                       mod->arch.init.plt_shndx = i;
> +               } else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
>                          !strcmp(secstrings + sechdrs[i].sh_name,
>                                  ".text.ftrace_trampoline"))
>                         tramp = sechdrs + i;
> diff --git a/kernel/module.c b/kernel/module.c
> index 6746c85511fe..2fc4d74288dd 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -3363,6 +3363,12 @@ static int post_relocation(struct module *mod, const
> struct load_info *info)
>         /* Setup kallsyms-specific fields. */
>         add_kallsyms(mod, info);
>
> +       if (is_livepatch_module(mod)) {
> +               err = copy_module_elf(mod, info);
> +               if (err < 0)
> +                       return err;
> +       }
> +
>         /* Arch-specific module finalizing. */
>         return module_finalize(info->hdr, info->sechdrs, mod);
> }
> @@ -3775,12 +3781,6 @@ static int load_module(struct load_info *info, const
> char __user *uargs,
>         if (err < 0)
>                 goto coming_cleanup;
>
> -       if (is_livepatch_module(mod)) {
> -               err = copy_module_elf(mod, info);
> -               if (err < 0)
> -                       goto sysfs_cleanup;
> -       }
> -
>         /* Get rid of temporary copy. */
>         free_copy(info);
>
> Thoughts? Does the fix make sense?
>
>> The last thing is count_plts() function called from
>> module_frob_arch_sections(). It needed to be changed in 2016 as well. See
>> Jessica's patch
>> (20160713001113.GA30925@packer-debian-8-amd64.digitalocean.com). The logic
>> in the function has changed since then. If I am not mistaken,
>> count_plts() is fine as it is right now. It does not consider SHN_UNDEF
>> anymore, it looks at the destination section (where a symbol should
>> resolved to) only.
>>
>> Jessica, could you doublecheck, please?
>
>
> Yes, I think count_plts() is fine now in this case (because it is
> always true that sym->st_shndx != dstidx for SHN_LIVEPATCH symbols)
> and my old patch from 2016 is no longer needed.
>
> Thanks,
>
> Jessica

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 3/4] arm64: implement live patching
  2018-10-19 12:18         ` Jessica Yu
@ 2018-10-19 15:14           ` Miroslav Benes
  0 siblings, 0 replies; 48+ messages in thread
From: Miroslav Benes @ 2018-10-19 15:14 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Torsten Duwe, Will Deacon, Catalin Marinas, Julien Thierry,
	Steven Rostedt, Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel,
	Arnd Bergmann, AKASHI Takahiro, linux-arm-kernel, linux-kernel,
	live-patching


> >If I am not mistaken, we do not care for arch.init.plt in livepatch. Is
> >that correct?
> 
> I do not believe patching of __init functions is supported (right?) So
> we do not need to keep arch.init.plt alive post-module-load.

I think we can do that. Theoretically. I'm not sure if it is actually 
useful. Module's init functions are called after the modules is patched, 
so there is no obstacle.

But arch.init.plt would be useful only for the relocations of the patching 
module, right? Patching functions would not part of init section anyway, I 
think, so arch.init.plt is useless post-module-load.
 
> >> int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
> >> 			      char *secstrings, struct module *mod)
> >> {
> >> @@ -210,11 +225,13 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr,
> >> Elf_Shdr
> >> *sechdrs,
> >>   * entries. Record the symtab address as well.
> >>   */
> >> 	for (i = 0; i < ehdr->e_shnum; i++) {
> >> -		if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
> >> +		if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt")) {
> >> 			mod->arch.core.plt = sechdrs + i;
> >> -		else if (!strcmp(secstrings + sechdrs[i].sh_name,
> >> ".init.plt"))
> >> +			mod->arch.core.plt_shndx = i;
> >> +		} else if (!strcmp(secstrings + sechdrs[i].sh_name,
> >> ".init.plt")) {
> >> 			mod->arch.init.plt = sechdrs + i;
> >> -		else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
> >> +			mod->arch.init.plt_shndx = i;
> >
> >It is initialized here, but that's not necessarily a bad thing.
> 
> I think I added this line for consistency, but I actually don't think
> it is needed. We only would need to keep the section index for
> arch.core.plt then.

Yes. I'd welcome a comment somewhere in both cases. Either that we 
initialize it just for consistency, or that we don't, because it's not 
needed.

Miroslav

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 3/4] arm64: implement live patching
  2018-10-19 13:52       ` Ard Biesheuvel
@ 2018-10-19 15:21         ` Miroslav Benes
  2018-10-20 14:10           ` Ard Biesheuvel
  0 siblings, 1 reply; 48+ messages in thread
From: Miroslav Benes @ 2018-10-19 15:21 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Jessica Yu, Torsten Duwe, Will Deacon, Catalin Marinas,
	Julien Thierry, Steven Rostedt, Josh Poimboeuf, Ingo Molnar,
	Arnd Bergmann, AKASHI Takahiro, linux-arm-kernel,
	Linux Kernel Mailing List, live-patching


> >> Ad relocations. I checked that everything in struct mod_arch_specific
> >> stays after the module is load. Both core and init get SHF_ALLOC set
> >> (mod->arch.core.plt->sh_flags in module_frob_arch_sections(). It is
> >> important because apply_relocate_add() may use those sections
> >> through module_emit_plt_entry() call.
> >
> >
> > Yes, it looks like the needed .plt sections will remain in module
> > memory. However, I think I found a slight issue... :/
> >
> > In module_frob_arch_sections(), mod->arch.core.plt is set to an offset
> > within info->sechdrs:
> >
> >             if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
> >                     mod->arch.core.plt = sechdrs + i;
> >
> > sechdrs is from info->sechdrs, which is freed at the end of
> > load_module() via free_copy(). So although the relevant plt section(s)
> > are in module memory, mod->arch.core.plt will point to invalid memory
> > after info is freed. In other words, the section (.plt) *is* in memory
> > but the section header (struct elf64_shdr) containing section metadata
> > like sh_addr, sh_size etc., is not.
> >
> 
> Just for my understanding: this only matters for live patching, right?

Yes. Live patching can do deferred relocations. When a module is loaded 
and livepatched, the relevant symbols in the patching module are resolved. 
We call apply_relocate_add(), which is arch-specific, and we must be sure 
that everything used by apply_relocate_add() is preserved in the patching 
module after it is loaded.
 
> The original PLT support was implemented to support loading modules
> outside of the -/+ 128 MB range of an arm64 relative branch/jump
> instruction, and was later enhanced [for the same reason] to support
> emitting a trampoline for the ftrace entrypoint.

Regards,
Miroslav

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 3/4] arm64: implement live patching
  2018-10-19 15:21         ` Miroslav Benes
@ 2018-10-20 14:10           ` Ard Biesheuvel
  2018-10-22 12:53             ` Miroslav Benes
  0 siblings, 1 reply; 48+ messages in thread
From: Ard Biesheuvel @ 2018-10-20 14:10 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Jessica Yu, Torsten Duwe, Will Deacon, Catalin Marinas,
	Julien Thierry, Steven Rostedt, Josh Poimboeuf, Ingo Molnar,
	Arnd Bergmann, AKASHI Takahiro, linux-arm-kernel,
	Linux Kernel Mailing List, live-patching

On 19 October 2018 at 23:21, Miroslav Benes <mbenes@suse.cz> wrote:
>
>> >> Ad relocations. I checked that everything in struct mod_arch_specific
>> >> stays after the module is load. Both core and init get SHF_ALLOC set
>> >> (mod->arch.core.plt->sh_flags in module_frob_arch_sections(). It is
>> >> important because apply_relocate_add() may use those sections
>> >> through module_emit_plt_entry() call.
>> >
>> >
>> > Yes, it looks like the needed .plt sections will remain in module
>> > memory. However, I think I found a slight issue... :/
>> >
>> > In module_frob_arch_sections(), mod->arch.core.plt is set to an offset
>> > within info->sechdrs:
>> >
>> >             if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
>> >                     mod->arch.core.plt = sechdrs + i;
>> >
>> > sechdrs is from info->sechdrs, which is freed at the end of
>> > load_module() via free_copy(). So although the relevant plt section(s)
>> > are in module memory, mod->arch.core.plt will point to invalid memory
>> > after info is freed. In other words, the section (.plt) *is* in memory
>> > but the section header (struct elf64_shdr) containing section metadata
>> > like sh_addr, sh_size etc., is not.
>> >
>>
>> Just for my understanding: this only matters for live patching, right?
>
> Yes. Live patching can do deferred relocations. When a module is loaded
> and livepatched, the relevant symbols in the patching module are resolved.
> We call apply_relocate_add(), which is arch-specific, and we must be sure
> that everything used by apply_relocate_add() is preserved in the patching
> module after it is loaded.
>

So I suppose this could get interesting in cases where modules are far
away from the kernel (i.e., more than -/+ 128 MB). Fortunately, the
modules themselves are always placed in a 128 MB window, but this
window could be out of reach for branches into the kernel proper. If
we find ourselves in the situation where we need to patch calls into
the kernel proper to point into this module, this may get interesting,
since the PLT entries *must* be allocated along with the module that
contains the branch instruction, or the PLT entry itself may be out of
range, defeating the purpose.


>> The original PLT support was implemented to support loading modules
>> outside of the -/+ 128 MB range of an arm64 relative branch/jump
>> instruction, and was later enhanced [for the same reason] to support
>> emitting a trampoline for the ftrace entrypoint.
>
> Regards,
> Miroslav

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 3/4] arm64: implement live patching
  2018-10-20 14:10           ` Ard Biesheuvel
@ 2018-10-22 12:53             ` Miroslav Benes
  2018-10-22 14:54               ` Torsten Duwe
  0 siblings, 1 reply; 48+ messages in thread
From: Miroslav Benes @ 2018-10-22 12:53 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Jessica Yu, Torsten Duwe, Will Deacon, Catalin Marinas,
	Julien Thierry, Steven Rostedt, Josh Poimboeuf, Ingo Molnar,
	Arnd Bergmann, AKASHI Takahiro, linux-arm-kernel,
	Linux Kernel Mailing List, live-patching

On Sat, 20 Oct 2018, Ard Biesheuvel wrote:

> On 19 October 2018 at 23:21, Miroslav Benes <mbenes@suse.cz> wrote:
> >
> >> >> Ad relocations. I checked that everything in struct mod_arch_specific
> >> >> stays after the module is load. Both core and init get SHF_ALLOC set
> >> >> (mod->arch.core.plt->sh_flags in module_frob_arch_sections(). It is
> >> >> important because apply_relocate_add() may use those sections
> >> >> through module_emit_plt_entry() call.
> >> >
> >> >
> >> > Yes, it looks like the needed .plt sections will remain in module
> >> > memory. However, I think I found a slight issue... :/
> >> >
> >> > In module_frob_arch_sections(), mod->arch.core.plt is set to an offset
> >> > within info->sechdrs:
> >> >
> >> >             if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
> >> >                     mod->arch.core.plt = sechdrs + i;
> >> >
> >> > sechdrs is from info->sechdrs, which is freed at the end of
> >> > load_module() via free_copy(). So although the relevant plt section(s)
> >> > are in module memory, mod->arch.core.plt will point to invalid memory
> >> > after info is freed. In other words, the section (.plt) *is* in memory
> >> > but the section header (struct elf64_shdr) containing section metadata
> >> > like sh_addr, sh_size etc., is not.
> >> >
> >>
> >> Just for my understanding: this only matters for live patching, right?
> >
> > Yes. Live patching can do deferred relocations. When a module is loaded
> > and livepatched, the relevant symbols in the patching module are resolved.
> > We call apply_relocate_add(), which is arch-specific, and we must be sure
> > that everything used by apply_relocate_add() is preserved in the patching
> > module after it is loaded.
> >
> 
> So I suppose this could get interesting in cases where modules are far
> away from the kernel (i.e., more than -/+ 128 MB). Fortunately, the
> modules themselves are always placed in a 128 MB window, but this
> window could be out of reach for branches into the kernel proper. If
> we find ourselves in the situation where we need to patch calls into
> the kernel proper to point into this module, this may get interesting,
> since the PLT entries *must* be allocated along with the module that
> contains the branch instruction, or the PLT entry itself may be out of
> range, defeating the purpose.

Hm... Torsten, didn't you have to solve something similar in powerpc case? 

> >> The original PLT support was implemented to support loading modules
> >> outside of the -/+ 128 MB range of an arm64 relative branch/jump
> >> instruction, and was later enhanced [for the same reason] to support
> >> emitting a trampoline for the ftrace entrypoint.


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 3/4] arm64: implement live patching
  2018-10-22 12:53             ` Miroslav Benes
@ 2018-10-22 14:54               ` Torsten Duwe
  0 siblings, 0 replies; 48+ messages in thread
From: Torsten Duwe @ 2018-10-22 14:54 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Ard Biesheuvel, Jessica Yu, Will Deacon, Catalin Marinas,
	Julien Thierry, Steven Rostedt, Josh Poimboeuf, Ingo Molnar,
	Arnd Bergmann, AKASHI Takahiro, linux-arm-kernel,
	Linux Kernel Mailing List, live-patching

On Mon, Oct 22, 2018 at 02:53:10PM +0200, Miroslav Benes wrote:
> On Sat, 20 Oct 2018, Ard Biesheuvel wrote:
> > So I suppose this could get interesting in cases where modules are far
> > away from the kernel (i.e., more than -/+ 128 MB). Fortunately, the
> > modules themselves are always placed in a 128 MB window, but this
> > window could be out of reach for branches into the kernel proper. If
> > we find ourselves in the situation where we need to patch calls into
> > the kernel proper to point into this module, this may get interesting,
> > since the PLT entries *must* be allocated along with the module that
> > contains the branch instruction, or the PLT entry itself may be out of
> > range, defeating the purpose.
> 
> Hm... Torsten, didn't you have to solve something similar in powerpc case? 

At address ranges: that was the TOC pointer issue, that is an array of
addresses and other 64-bit constants with a reference to it kept in a register,
whose value is local to any module. PPC64 needs 5 instructions to load a 64-bit
value immediate, arm has the ldr_l macro. So in short: no. We were discussing a
few nasty tricks to reconstruct the TOC value from the code addresses, but
ended up with a solution that is clean in that respect.

At PLTs: ppc64 uses, IIRC, a regular trampoline slot to get to all called
functions, including ftrace and friends. So for e.g. live patch kernels
the total number, as predetermined, is only incremented by 2. Only the
trampoline _code_ for ftrace was modified.
I assume with "branches" you mean "branch with link" a.k.a a subroutine call?
All references to external symbols are allocated a PLT entry, right?

ftrace / live patching calls are alway intercepted at the called function.
                                       ===========--------======---------
The interceptor then uses a special trampoline to go to ftrace_caller, which
does the rest.

Am I missing something?

	Torsten


^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH] arm64/module: use mod->klp_info section header information
  2018-10-01 14:16 ` [PATCH v3 3/4] arm64: implement live patching Torsten Duwe
  2018-10-17 13:39   ` Miroslav Benes
@ 2018-10-23 17:55   ` Jessica Yu
  2018-10-23 19:32     ` kbuild test robot
                       ` (3 more replies)
  2018-11-05 17:57   ` [PATCH] arm64/module: use plt section indices for relocations Jessica Yu
  2 siblings, 4 replies; 48+ messages in thread
From: Jessica Yu @ 2018-10-23 17:55 UTC (permalink / raw)
  To: Torsten Duwe
  Cc: Will Deacon, Catalin Marinas, Julien Thierry, Steven Rostedt,
	Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel, Arnd Bergmann,
	AKASHI Takahiro, linux-arm-kernel, linux-kernel, live-patching

The arm64 module loader keeps a pointer into info->sechdrs to keep track
of section header information for .plt section(s). A pointer to the
relevent section header (struct elf64_shdr) in info->sechdrs is stored
in mod->arch.{init,core}.plt. This pointer may be accessed while
applying relocations in apply_relocate_add() for example. And unlike
normal modules, livepatch modules can call apply_relocate_add() after
module load. But the info struct (and therefore info->sechdrs) gets
freed at the end of load_module() and so mod->arch.{init,core}.plt
becomes an invalid pointer after the module is done loading.

Luckily, livepatch modules already keep a copy of Elf section header
information in mod->klp_info. So make sure livepatch modules on arm64
have access to the section headers in klp_info and set
mod->arch.{init,core}.plt to the appropriate section header in
mod->klp_info so that they can call apply_relocate_add() even after
module load.

Signed-off-by: Jessica Yu <jeyu@kernel.org>
---

Hi!

This patch may be applied on top or merged with the 3rd patch. I
incoporated Miroslav's suggestions from the discussion. It's needed in
order for livepatch modules on arm64 to be able to call
apply_relocate_add() post-module-load, otherwise we could end up
accessing invalid pointers from apply_relocate_add().

 arch/arm64/include/asm/module.h |  1 +
 arch/arm64/kernel/module-plts.c | 10 ++++++++--
 arch/arm64/kernel/module.c      | 10 ++++++++++
 kernel/module.c                 | 22 +++++++++++++---------
 4 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/include/asm/module.h b/arch/arm64/include/asm/module.h
index fef773c94e9d..ac9b97f9ae5e 100644
--- a/arch/arm64/include/asm/module.h
+++ b/arch/arm64/include/asm/module.h
@@ -25,6 +25,7 @@ struct mod_plt_sec {
 	struct elf64_shdr	*plt;
 	int			plt_num_entries;
 	int			plt_max_entries;
+	int			plt_shndx;
 };
 
 struct mod_arch_specific {
diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c
index f0690c2ca3e0..05067717dfc5 100644
--- a/arch/arm64/kernel/module-plts.c
+++ b/arch/arm64/kernel/module-plts.c
@@ -210,9 +210,15 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 	 * entries. Record the symtab address as well.
 	 */
 	for (i = 0; i < ehdr->e_shnum; i++) {
-		if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
+		if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt")) {
 			mod->arch.core.plt = sechdrs + i;
-		else if (!strcmp(secstrings + sechdrs[i].sh_name, ".init.plt"))
+			/*
+			 * Keep the section index for the .plt section for
+			 * livepatching. Note that .init.plt is irrelevant to
+			 * livepatch, so only the shndx for .plt is saved.
+			 */
+			mod->arch.core.plt_shndx = i;
+		} else if (!strcmp(secstrings + sechdrs[i].sh_name, ".init.plt"))
 			mod->arch.init.plt = sechdrs + i;
 		else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
 			 !strcmp(secstrings + sechdrs[i].sh_name,
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index dd23655fda3a..490e56070a7e 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -461,5 +461,15 @@ int module_finalize(const Elf_Ehdr *hdr,
 #endif
 	}
 
+#ifdef CONFIG_LIVEPATCH
+	/*
+	 * For livepatching, switch to the saved section header info for .plt
+	 * stored in mod->klp_info. This is needed so that livepatch is able to
+	 * call apply_relocate_add() after patch module load.
+	 */
+	if (is_livepatch_module(me))
+		me->arch.core.plt = me->klp_info->sechdrs + me->arch.core.plt_shndx;
+#endif
+
 	return 0;
 }
diff --git a/kernel/module.c b/kernel/module.c
index f475f30eed8c..f3ac04cc9fc3 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3367,6 +3367,8 @@ int __weak module_finalize(const Elf_Ehdr *hdr,
 
 static int post_relocation(struct module *mod, const struct load_info *info)
 {
+	int err;
+
 	/* Sort exception table now relocations are done. */
 	sort_extable(mod->extable, mod->extable + mod->num_exentries);
 
@@ -3377,8 +3379,18 @@ static int post_relocation(struct module *mod, const struct load_info *info)
 	/* Setup kallsyms-specific fields. */
 	add_kallsyms(mod, info);
 
+	if (is_livepatch_module(mod)) {
+		err = copy_module_elf(mod, info);
+		if (err < 0)
+			return err;
+	}
+
 	/* Arch-specific module finalizing. */
-	return module_finalize(info->hdr, info->sechdrs, mod);
+	err = module_finalize(info->hdr, info->sechdrs, mod);
+	if (err < 0)
+		free_module_elf(mod);
+
+	return err;
 }
 
 /* Is this module of this name done loading?  No locks held. */
@@ -3770,12 +3782,6 @@ static int load_module(struct load_info *info, const char __user *uargs,
 	if (err < 0)
 		goto coming_cleanup;
 
-	if (is_livepatch_module(mod)) {
-		err = copy_module_elf(mod, info);
-		if (err < 0)
-			goto sysfs_cleanup;
-	}
-
 	/* Get rid of temporary copy. */
 	free_copy(info);
 
@@ -3784,8 +3790,6 @@ static int load_module(struct load_info *info, const char __user *uargs,
 
 	return do_init_module(mod);
 
- sysfs_cleanup:
-	mod_sysfs_teardown(mod);
  coming_cleanup:
 	mod->state = MODULE_STATE_GOING;
 	destroy_params(mod->kp, mod->num_kp);
-- 
2.16.4


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] arm64/module: use mod->klp_info section header information
  2018-10-23 17:55   ` [PATCH] arm64/module: use mod->klp_info section header information Jessica Yu
@ 2018-10-23 19:32     ` kbuild test robot
  2018-10-24 11:57     ` Miroslav Benes
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 48+ messages in thread
From: kbuild test robot @ 2018-10-23 19:32 UTC (permalink / raw)
  To: Jessica Yu
  Cc: kbuild-all, Torsten Duwe, Will Deacon, Catalin Marinas,
	Julien Thierry, Steven Rostedt, Josh Poimboeuf, Ingo Molnar,
	Ard Biesheuvel, Arnd Bergmann, AKASHI Takahiro, linux-arm-kernel,
	linux-kernel, live-patching

[-- Attachment #1: Type: text/plain, Size: 2301 bytes --]

Hi Jessica,

I love your patch! Perhaps something to improve:

[auto build test WARNING on arm64/for-next/core]
[also build test WARNING on v4.19 next-20181019]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Jessica-Yu/arm64-module-use-mod-klp_info-section-header-information/20181024-023709
base:   https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-next/core
config: x86_64-randconfig-x002-201842 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   kernel/module.c: In function 'post_relocation':
>> kernel/module.c:3369:30: warning: passing argument 2 of 'copy_module_elf' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
      err = copy_module_elf(mod, info);
                                 ^~~~
   kernel/module.c:2103:12: note: expected 'struct load_info *' but argument is of type 'const struct load_info *'
    static int copy_module_elf(struct module *mod, struct load_info *info)
               ^~~~~~~~~~~~~~~

vim +3369 kernel/module.c

  3353	
  3354	static int post_relocation(struct module *mod, const struct load_info *info)
  3355	{
  3356		int err;
  3357	
  3358		/* Sort exception table now relocations are done. */
  3359		sort_extable(mod->extable, mod->extable + mod->num_exentries);
  3360	
  3361		/* Copy relocated percpu area over. */
  3362		percpu_modcopy(mod, (void *)info->sechdrs[info->index.pcpu].sh_addr,
  3363			       info->sechdrs[info->index.pcpu].sh_size);
  3364	
  3365		/* Setup kallsyms-specific fields. */
  3366		add_kallsyms(mod, info);
  3367	
  3368		if (is_livepatch_module(mod)) {
> 3369			err = copy_module_elf(mod, info);
  3370			if (err < 0)
  3371				return err;
  3372		}
  3373	
  3374		/* Arch-specific module finalizing. */
  3375		err = module_finalize(info->hdr, info->sechdrs, mod);
  3376		if (err < 0)
  3377			free_module_elf(mod);
  3378	
  3379		return err;
  3380	}
  3381	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 31980 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] arm64/module: use mod->klp_info section header information
  2018-10-23 17:55   ` [PATCH] arm64/module: use mod->klp_info section header information Jessica Yu
  2018-10-23 19:32     ` kbuild test robot
@ 2018-10-24 11:57     ` Miroslav Benes
  2018-10-25  8:08     ` Petr Mladek
  2018-10-26 17:25     ` [PATCH v2] arm64/module: use mod->klp_info section header information for livepatch modules Jessica Yu
  3 siblings, 0 replies; 48+ messages in thread
From: Miroslav Benes @ 2018-10-24 11:57 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Torsten Duwe, Will Deacon, Catalin Marinas, Julien Thierry,
	Steven Rostedt, Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel,
	Arnd Bergmann, AKASHI Takahiro, linux-arm-kernel, linux-kernel,
	live-patching

On Tue, 23 Oct 2018, Jessica Yu wrote:

> diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c
> index f0690c2ca3e0..05067717dfc5 100644
> --- a/arch/arm64/kernel/module-plts.c
> +++ b/arch/arm64/kernel/module-plts.c
> @@ -210,9 +210,15 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr
> *sechdrs,
> 	 * entries. Record the symtab address as well.
> 	 */
> 	for (i = 0; i < ehdr->e_shnum; i++) {
> -		if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
> +		if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt")) {
> 			mod->arch.core.plt = sechdrs + i;
> -		else if (!strcmp(secstrings + sechdrs[i].sh_name,
> ".init.plt"))
> +			/*
> +			 * Keep the section index for the .plt section for
> +			 * livepatching. Note that .init.plt is irrelevant to
> +			 * livepatch, so only the shndx for .plt is saved.
> +			 */
> +			mod->arch.core.plt_shndx = i;
> +		} else if (!strcmp(secstrings + sechdrs[i].sh_name,
> ".init.plt"))

Else branches should have braces now too according to the coding style.

Otherwise it looks good to me.

Thanks,
Miroslav

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] arm64/module: use mod->klp_info section header information
  2018-10-23 17:55   ` [PATCH] arm64/module: use mod->klp_info section header information Jessica Yu
  2018-10-23 19:32     ` kbuild test robot
  2018-10-24 11:57     ` Miroslav Benes
@ 2018-10-25  8:08     ` Petr Mladek
  2018-10-25  9:00       ` Miroslav Benes
  2018-10-26 17:25     ` [PATCH v2] arm64/module: use mod->klp_info section header information for livepatch modules Jessica Yu
  3 siblings, 1 reply; 48+ messages in thread
From: Petr Mladek @ 2018-10-25  8:08 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Torsten Duwe, Will Deacon, Catalin Marinas, Julien Thierry,
	Steven Rostedt, Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel,
	Arnd Bergmann, AKASHI Takahiro, linux-arm-kernel, linux-kernel,
	live-patching

On Tue 2018-10-23 19:55:54, Jessica Yu wrote:
> The arm64 module loader keeps a pointer into info->sechdrs to keep track
> of section header information for .plt section(s). A pointer to the
> relevent section header (struct elf64_shdr) in info->sechdrs is stored
> in mod->arch.{init,core}.plt. This pointer may be accessed while
> applying relocations in apply_relocate_add() for example. And unlike
> normal modules, livepatch modules can call apply_relocate_add() after
> module load. But the info struct (and therefore info->sechdrs) gets
> freed at the end of load_module() and so mod->arch.{init,core}.plt
> becomes an invalid pointer after the module is done loading.
> 
> Luckily, livepatch modules already keep a copy of Elf section header
> information in mod->klp_info. So make sure livepatch modules on arm64
> have access to the section headers in klp_info and set
> mod->arch.{init,core}.plt to the appropriate section header in
> mod->klp_info so that they can call apply_relocate_add() even after
> module load.
> 
> diff --git a/kernel/module.c b/kernel/module.c
> index f475f30eed8c..f3ac04cc9fc3 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -3367,6 +3367,8 @@ int __weak module_finalize(const Elf_Ehdr *hdr,
> 
> static int post_relocation(struct module *mod, const struct load_info *info)
> {
> +	int err;
> +
> 	/* Sort exception table now relocations are done. */
> 	sort_extable(mod->extable, mod->extable + mod->num_exentries);
> 
> @@ -3377,8 +3379,18 @@ static int post_relocation(struct module *mod, const struct load_info *info)
> 	/* Setup kallsyms-specific fields. */
> 	add_kallsyms(mod, info);
> 
> +	if (is_livepatch_module(mod)) {
> +		err = copy_module_elf(mod, info);
> +		if (err < 0)
> +			return err;
> +	}
> +
> 	/* Arch-specific module finalizing. */
> -	return module_finalize(info->hdr, info->sechdrs, mod);
> +	err = module_finalize(info->hdr, info->sechdrs, mod);
> +	if (err < 0)

	if (err < 0 && is_livepatch_module(mod))

> +		free_module_elf(mod);
> +
> +	return err;
> }

Also we need to free the copied stuff in load_module() when
anything called after post_relocation() fails. I think
that the following would work:

--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3823,6 +3823,8 @@ static int load_module(struct load_info *info, const char __user *uargs,
 	kfree(mod->args);
  free_arch_cleanup:
 	module_arch_cleanup(mod);
+	if (is_livepatch_module(mod))
+		free_module_elf(mod);
  free_modinfo:
 	free_modinfo(mod);
  free_unload:


But I suggest to just move copy_module_elf() up and keep
calling it from load_module() directly. It would make
the error handling more clear.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] arm64/module: use mod->klp_info section header information
  2018-10-25  8:08     ` Petr Mladek
@ 2018-10-25  9:00       ` Miroslav Benes
  2018-10-25 11:42         ` Jessica Yu
  0 siblings, 1 reply; 48+ messages in thread
From: Miroslav Benes @ 2018-10-25  9:00 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Jessica Yu, Torsten Duwe, Will Deacon, Catalin Marinas,
	Julien Thierry, Steven Rostedt, Josh Poimboeuf, Ingo Molnar,
	Ard Biesheuvel, Arnd Bergmann, AKASHI Takahiro, linux-arm-kernel,
	linux-kernel, live-patching

On Thu, 25 Oct 2018, Petr Mladek wrote:

> On Tue 2018-10-23 19:55:54, Jessica Yu wrote:
> > The arm64 module loader keeps a pointer into info->sechdrs to keep track
> > of section header information for .plt section(s). A pointer to the
> > relevent section header (struct elf64_shdr) in info->sechdrs is stored
> > in mod->arch.{init,core}.plt. This pointer may be accessed while
> > applying relocations in apply_relocate_add() for example. And unlike
> > normal modules, livepatch modules can call apply_relocate_add() after
> > module load. But the info struct (and therefore info->sechdrs) gets
> > freed at the end of load_module() and so mod->arch.{init,core}.plt
> > becomes an invalid pointer after the module is done loading.
> > 
> > Luckily, livepatch modules already keep a copy of Elf section header
> > information in mod->klp_info. So make sure livepatch modules on arm64
> > have access to the section headers in klp_info and set
> > mod->arch.{init,core}.plt to the appropriate section header in
> > mod->klp_info so that they can call apply_relocate_add() even after
> > module load.
> > 
> > diff --git a/kernel/module.c b/kernel/module.c
> > index f475f30eed8c..f3ac04cc9fc3 100644
> > --- a/kernel/module.c
> > +++ b/kernel/module.c
> > @@ -3367,6 +3367,8 @@ int __weak module_finalize(const Elf_Ehdr *hdr,
> > 
> > static int post_relocation(struct module *mod, const struct load_info *info)
> > {
> > +	int err;
> > +
> > 	/* Sort exception table now relocations are done. */
> > 	sort_extable(mod->extable, mod->extable + mod->num_exentries);
> > 
> > @@ -3377,8 +3379,18 @@ static int post_relocation(struct module *mod, const struct load_info *info)
> > 	/* Setup kallsyms-specific fields. */
> > 	add_kallsyms(mod, info);
> > 
> > +	if (is_livepatch_module(mod)) {
> > +		err = copy_module_elf(mod, info);
> > +		if (err < 0)
> > +			return err;
> > +	}
> > +
> > 	/* Arch-specific module finalizing. */
> > -	return module_finalize(info->hdr, info->sechdrs, mod);
> > +	err = module_finalize(info->hdr, info->sechdrs, mod);
> > +	if (err < 0)
> 
> 	if (err < 0 && is_livepatch_module(mod))

Ah, right.
 
> > +		free_module_elf(mod);
> > +
> > +	return err;
> > }
> 
> Also we need to free the copied stuff in load_module() when
> anything called after post_relocation() fails. I think
> that the following would work:
> 
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -3823,6 +3823,8 @@ static int load_module(struct load_info *info, const char __user *uargs,
>  	kfree(mod->args);
>   free_arch_cleanup:
>  	module_arch_cleanup(mod);
> +	if (is_livepatch_module(mod))
> +		free_module_elf(mod);
>   free_modinfo:
>  	free_modinfo(mod);
>   free_unload:

Yes, we need to free it somewhere and I missed it. free_arch_cleanup seems 
to be the correct place.
 
> But I suggest to just move copy_module_elf() up and keep
> calling it from load_module() directly. It would make
> the error handling more clear.

Unfortunately it is not that simple. arm64's module_finalize() uses 
mod->klp_info with the patch, so copy_module_elf() must be called before. 
We could move module_finalize() from post_relocation() to load_module() 
and place copy_module_elf() between those two, but I don't know. That's up 
to Jessica.

Miroslav

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] arm64/module: use mod->klp_info section header information
  2018-10-25  9:00       ` Miroslav Benes
@ 2018-10-25 11:42         ` Jessica Yu
  0 siblings, 0 replies; 48+ messages in thread
From: Jessica Yu @ 2018-10-25 11:42 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Petr Mladek, Torsten Duwe, Will Deacon, Catalin Marinas,
	Julien Thierry, Steven Rostedt, Josh Poimboeuf, Ingo Molnar,
	Ard Biesheuvel, Arnd Bergmann, AKASHI Takahiro, linux-arm-kernel,
	linux-kernel, live-patching

+++ Miroslav Benes [25/10/18 11:00 +0200]:
>On Thu, 25 Oct 2018, Petr Mladek wrote:
>
>> On Tue 2018-10-23 19:55:54, Jessica Yu wrote:
>> > The arm64 module loader keeps a pointer into info->sechdrs to keep track
>> > of section header information for .plt section(s). A pointer to the
>> > relevent section header (struct elf64_shdr) in info->sechdrs is stored
>> > in mod->arch.{init,core}.plt. This pointer may be accessed while
>> > applying relocations in apply_relocate_add() for example. And unlike
>> > normal modules, livepatch modules can call apply_relocate_add() after
>> > module load. But the info struct (and therefore info->sechdrs) gets
>> > freed at the end of load_module() and so mod->arch.{init,core}.plt
>> > becomes an invalid pointer after the module is done loading.
>> >
>> > Luckily, livepatch modules already keep a copy of Elf section header
>> > information in mod->klp_info. So make sure livepatch modules on arm64
>> > have access to the section headers in klp_info and set
>> > mod->arch.{init,core}.plt to the appropriate section header in
>> > mod->klp_info so that they can call apply_relocate_add() even after
>> > module load.
>> >
>> > diff --git a/kernel/module.c b/kernel/module.c
>> > index f475f30eed8c..f3ac04cc9fc3 100644
>> > --- a/kernel/module.c
>> > +++ b/kernel/module.c
>> > @@ -3367,6 +3367,8 @@ int __weak module_finalize(const Elf_Ehdr *hdr,
>> >
>> > static int post_relocation(struct module *mod, const struct load_info *info)
>> > {
>> > +	int err;
>> > +
>> > 	/* Sort exception table now relocations are done. */
>> > 	sort_extable(mod->extable, mod->extable + mod->num_exentries);
>> >
>> > @@ -3377,8 +3379,18 @@ static int post_relocation(struct module *mod, const struct load_info *info)
>> > 	/* Setup kallsyms-specific fields. */
>> > 	add_kallsyms(mod, info);
>> >
>> > +	if (is_livepatch_module(mod)) {
>> > +		err = copy_module_elf(mod, info);
>> > +		if (err < 0)
>> > +			return err;
>> > +	}
>> > +
>> > 	/* Arch-specific module finalizing. */
>> > -	return module_finalize(info->hdr, info->sechdrs, mod);
>> > +	err = module_finalize(info->hdr, info->sechdrs, mod);
>> > +	if (err < 0)
>>
>> 	if (err < 0 && is_livepatch_module(mod))
>
>Ah, right.
>
>> > +		free_module_elf(mod);
>> > +
>> > +	return err;
>> > }
>>
>> Also we need to free the copied stuff in load_module() when
>> anything called after post_relocation() fails. I think
>> that the following would work:
>>
>> --- a/kernel/module.c
>> +++ b/kernel/module.c
>> @@ -3823,6 +3823,8 @@ static int load_module(struct load_info *info, const char __user *uargs,
>>  	kfree(mod->args);
>>   free_arch_cleanup:
>>  	module_arch_cleanup(mod);
>> +	if (is_livepatch_module(mod))
>> +		free_module_elf(mod);
>>   free_modinfo:
>>  	free_modinfo(mod);
>>   free_unload:
>
>Yes, we need to free it somewhere and I missed it. free_arch_cleanup seems
>to be the correct place.

Good catches, thank you both!

>> But I suggest to just move copy_module_elf() up and keep
>> calling it from load_module() directly. It would make
>> the error handling more clear.
>
>Unfortunately it is not that simple. arm64's module_finalize() uses
>mod->klp_info with the patch, so copy_module_elf() must be called before.
>We could move module_finalize() from post_relocation() to load_module()
>and place copy_module_elf() between those two, but I don't know. That's up
>to Jessica.

Yeah, it's a stylistic preference - will shuffle those calls around
and see what looks best. v2 to come shortly.

Thanks!

Jessica

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH v2] arm64/module: use mod->klp_info section header information for livepatch modules
  2018-10-23 17:55   ` [PATCH] arm64/module: use mod->klp_info section header information Jessica Yu
                       ` (2 preceding siblings ...)
  2018-10-25  8:08     ` Petr Mladek
@ 2018-10-26 17:25     ` Jessica Yu
  2018-10-29 13:24       ` Miroslav Benes
  2018-10-29 15:28       ` Will Deacon
  3 siblings, 2 replies; 48+ messages in thread
From: Jessica Yu @ 2018-10-26 17:25 UTC (permalink / raw)
  To: Torsten Duwe
  Cc: Will Deacon, Catalin Marinas, Julien Thierry, Steven Rostedt,
	Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel, Arnd Bergmann,
	AKASHI Takahiro, Miroslav Benes, Petr Mladek, linux-arm-kernel,
	linux-kernel, live-patching

The arm64 module loader keeps a pointer into info->sechdrs to keep track
of section header information for .plt section(s). A pointer to the
relevent section header (struct elf64_shdr) in info->sechdrs is stored
in mod->arch.{init,core}.plt. This pointer may be accessed while
applying relocations in apply_relocate_add() for example. And unlike
normal modules, livepatch modules can call apply_relocate_add() after
module load. But the info struct (and therefore info->sechdrs) gets
freed at the end of load_module() and so mod->arch.{init,core}.plt
becomes an invalid pointer after the module is done loading.

Luckily, livepatch modules already keep a copy of Elf section header
information in mod->klp_info. So make sure livepatch modules on arm64
have access to the section headers in klp_info and set
mod->arch.{init,core}.plt to the appropriate section header in
mod->klp_info so that they can call apply_relocate_add() even after
module load.

Signed-off-by: Jessica Yu <jeyu@kernel.org>
---

v2:
  - fix missing free_module_elf() in error path
  - move copy_module_elf() and module_finalize() out of post_relocation()
    to make error handling more clear
  - add braces to if-else block in arm64 module_frob_arch_sections()

 arch/arm64/include/asm/module.h |  1 +
 arch/arm64/kernel/module-plts.c | 17 ++++++++++++-----
 arch/arm64/kernel/module.c      | 10 ++++++++++
 kernel/module.c                 | 29 +++++++++++++++--------------
 4 files changed, 38 insertions(+), 19 deletions(-)

diff --git a/arch/arm64/include/asm/module.h b/arch/arm64/include/asm/module.h
index fef773c94e9d..ac9b97f9ae5e 100644
--- a/arch/arm64/include/asm/module.h
+++ b/arch/arm64/include/asm/module.h
@@ -25,6 +25,7 @@ struct mod_plt_sec {
 	struct elf64_shdr	*plt;
 	int			plt_num_entries;
 	int			plt_max_entries;
+	int			plt_shndx;
 };
 
 struct mod_arch_specific {
diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c
index f0690c2ca3e0..851311ffd427 100644
--- a/arch/arm64/kernel/module-plts.c
+++ b/arch/arm64/kernel/module-plts.c
@@ -210,16 +210,23 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 	 * entries. Record the symtab address as well.
 	 */
 	for (i = 0; i < ehdr->e_shnum; i++) {
-		if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
+		if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt")) {
 			mod->arch.core.plt = sechdrs + i;
-		else if (!strcmp(secstrings + sechdrs[i].sh_name, ".init.plt"))
+			/*
+			 * Keep the section index for the .plt section for
+			 * livepatching. Note that .init.plt is irrelevant to
+			 * livepatch, so only the shndx for .plt is saved.
+			 */
+			mod->arch.core.plt_shndx = i;
+		} else if (!strcmp(secstrings + sechdrs[i].sh_name, ".init.plt")) {
 			mod->arch.init.plt = sechdrs + i;
-		else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
+		} else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
 			 !strcmp(secstrings + sechdrs[i].sh_name,
-				 ".text.ftrace_trampoline"))
+				 ".text.ftrace_trampoline")) {
 			tramp = sechdrs + i;
-		else if (sechdrs[i].sh_type == SHT_SYMTAB)
+		} else if (sechdrs[i].sh_type == SHT_SYMTAB) {
 			syms = (Elf64_Sym *)sechdrs[i].sh_addr;
+		}
 	}
 
 	if (!mod->arch.core.plt || !mod->arch.init.plt) {
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index dd23655fda3a..490e56070a7e 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -461,5 +461,15 @@ int module_finalize(const Elf_Ehdr *hdr,
 #endif
 	}
 
+#ifdef CONFIG_LIVEPATCH
+	/*
+	 * For livepatching, switch to the saved section header info for .plt
+	 * stored in mod->klp_info. This is needed so that livepatch is able to
+	 * call apply_relocate_add() after patch module load.
+	 */
+	if (is_livepatch_module(me))
+		me->arch.core.plt = me->klp_info->sechdrs + me->arch.core.plt_shndx;
+#endif
+
 	return 0;
 }
diff --git a/kernel/module.c b/kernel/module.c
index f475f30eed8c..611f4fe64370 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3365,7 +3365,7 @@ int __weak module_finalize(const Elf_Ehdr *hdr,
 	return 0;
 }
 
-static int post_relocation(struct module *mod, const struct load_info *info)
+static void post_relocation(struct module *mod, const struct load_info *info)
 {
 	/* Sort exception table now relocations are done. */
 	sort_extable(mod->extable, mod->extable + mod->num_exentries);
@@ -3376,9 +3376,6 @@ static int post_relocation(struct module *mod, const struct load_info *info)
 
 	/* Setup kallsyms-specific fields. */
 	add_kallsyms(mod, info);
-
-	/* Arch-specific module finalizing. */
-	return module_finalize(info->hdr, info->sechdrs, mod);
 }
 
 /* Is this module of this name done loading?  No locks held. */
@@ -3726,9 +3723,18 @@ static int load_module(struct load_info *info, const char __user *uargs,
 	if (err < 0)
 		goto free_modinfo;
 
-	err = post_relocation(mod, info);
+	post_relocation(mod, info);
+
+	if (is_livepatch_module(mod)) {
+		err = copy_module_elf(mod, info);
+		if (err < 0)
+			goto free_modinfo;
+	}
+
+	/* Arch-specific module finalizing. */
+	err = module_finalize(info->hdr, info->sechdrs, mod);
 	if (err < 0)
-		goto free_modinfo;
+		goto free_module_elf;
 
 	flush_module_icache(mod);
 
@@ -3770,12 +3776,6 @@ static int load_module(struct load_info *info, const char __user *uargs,
 	if (err < 0)
 		goto coming_cleanup;
 
-	if (is_livepatch_module(mod)) {
-		err = copy_module_elf(mod, info);
-		if (err < 0)
-			goto sysfs_cleanup;
-	}
-
 	/* Get rid of temporary copy. */
 	free_copy(info);
 
@@ -3784,8 +3784,6 @@ static int load_module(struct load_info *info, const char __user *uargs,
 
 	return do_init_module(mod);
 
- sysfs_cleanup:
-	mod_sysfs_teardown(mod);
  coming_cleanup:
 	mod->state = MODULE_STATE_GOING;
 	destroy_params(mod->kp, mod->num_kp);
@@ -3809,6 +3807,9 @@ static int load_module(struct load_info *info, const char __user *uargs,
 	kfree(mod->args);
  free_arch_cleanup:
 	module_arch_cleanup(mod);
+ free_module_elf:
+	if (is_livepatch_module(mod))
+		free_module_elf(mod);
  free_modinfo:
 	free_modinfo(mod);
  free_unload:
-- 
2.16.4


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2] arm64/module: use mod->klp_info section header information for livepatch modules
  2018-10-26 17:25     ` [PATCH v2] arm64/module: use mod->klp_info section header information for livepatch modules Jessica Yu
@ 2018-10-29 13:24       ` Miroslav Benes
  2018-10-29 13:32         ` Jessica Yu
  2018-10-29 15:28       ` Will Deacon
  1 sibling, 1 reply; 48+ messages in thread
From: Miroslav Benes @ 2018-10-29 13:24 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Torsten Duwe, Will Deacon, Catalin Marinas, Julien Thierry,
	Steven Rostedt, Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel,
	Arnd Bergmann, AKASHI Takahiro, Petr Mladek, linux-arm-kernel,
	linux-kernel, live-patching


> diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
> index dd23655fda3a..490e56070a7e 100644
> --- a/arch/arm64/kernel/module.c
> +++ b/arch/arm64/kernel/module.c
> @@ -461,5 +461,15 @@ int module_finalize(const Elf_Ehdr *hdr,
> #endif
> 	}
> 
> +#ifdef CONFIG_LIVEPATCH
> +	/*
> +	 * For livepatching, switch to the saved section header info for .plt
> +	 * stored in mod->klp_info. This is needed so that livepatch is able to
> +	 * call apply_relocate_add() after patch module load.
> +	 */
> +	if (is_livepatch_module(me))
> +		me->arch.core.plt = me->klp_info->sechdrs + me->arch.core.plt_shndx;
> +#endif

I missed it before, but the hunk should be under "#ifdef 
CONFIG_ARM64_MODULE_PLTS" protection similarly to ftrace_trampoline just 
above. me->arch.core.plt may not exist otherwise.

Miroslav

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2] arm64/module: use mod->klp_info section header information for livepatch modules
  2018-10-29 13:24       ` Miroslav Benes
@ 2018-10-29 13:32         ` Jessica Yu
  0 siblings, 0 replies; 48+ messages in thread
From: Jessica Yu @ 2018-10-29 13:32 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Torsten Duwe, Will Deacon, Catalin Marinas, Julien Thierry,
	Steven Rostedt, Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel,
	Arnd Bergmann, AKASHI Takahiro, Petr Mladek, linux-arm-kernel,
	linux-kernel, live-patching

+++ Miroslav Benes [29/10/18 14:24 +0100]:
>
>> diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
>> index dd23655fda3a..490e56070a7e 100644
>> --- a/arch/arm64/kernel/module.c
>> +++ b/arch/arm64/kernel/module.c
>> @@ -461,5 +461,15 @@ int module_finalize(const Elf_Ehdr *hdr,
>> #endif
>> 	}
>>
>> +#ifdef CONFIG_LIVEPATCH
>> +	/*
>> +	 * For livepatching, switch to the saved section header info for .plt
>> +	 * stored in mod->klp_info. This is needed so that livepatch is able to
>> +	 * call apply_relocate_add() after patch module load.
>> +	 */
>> +	if (is_livepatch_module(me))
>> +		me->arch.core.plt = me->klp_info->sechdrs + me->arch.core.plt_shndx;
>> +#endif
>
>I missed it before, but the hunk should be under "#ifdef
>CONFIG_ARM64_MODULE_PLTS" protection similarly to ftrace_trampoline just
>above. me->arch.core.plt may not exist otherwise.

Gah! Yes you are right, will fix.

Thanks,

Jessica

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2] arm64/module: use mod->klp_info section header information for livepatch modules
  2018-10-26 17:25     ` [PATCH v2] arm64/module: use mod->klp_info section header information for livepatch modules Jessica Yu
  2018-10-29 13:24       ` Miroslav Benes
@ 2018-10-29 15:28       ` Will Deacon
  2018-10-30 13:19         ` Jessica Yu
  1 sibling, 1 reply; 48+ messages in thread
From: Will Deacon @ 2018-10-29 15:28 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Torsten Duwe, Catalin Marinas, Julien Thierry, Steven Rostedt,
	Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel, Arnd Bergmann,
	AKASHI Takahiro, Miroslav Benes, Petr Mladek, linux-arm-kernel,
	linux-kernel, live-patching

Hi Jessica,

On Fri, Oct 26, 2018 at 07:25:01PM +0200, Jessica Yu wrote:
> The arm64 module loader keeps a pointer into info->sechdrs to keep track
> of section header information for .plt section(s). A pointer to the
> relevent section header (struct elf64_shdr) in info->sechdrs is stored
> in mod->arch.{init,core}.plt. This pointer may be accessed while
> applying relocations in apply_relocate_add() for example. And unlike
> normal modules, livepatch modules can call apply_relocate_add() after
> module load. But the info struct (and therefore info->sechdrs) gets
> freed at the end of load_module() and so mod->arch.{init,core}.plt
> becomes an invalid pointer after the module is done loading.
> 
> Luckily, livepatch modules already keep a copy of Elf section header
> information in mod->klp_info. So make sure livepatch modules on arm64
> have access to the section headers in klp_info and set
> mod->arch.{init,core}.plt to the appropriate section header in
> mod->klp_info so that they can call apply_relocate_add() even after
> module load.
> 
> Signed-off-by: Jessica Yu <jeyu@kernel.org>
> ---
> 
> v2:
>  - fix missing free_module_elf() in error path
>  - move copy_module_elf() and module_finalize() out of post_relocation()
>    to make error handling more clear
>  - add braces to if-else block in arm64 module_frob_arch_sections()
> 
> arch/arm64/include/asm/module.h |  1 +
> arch/arm64/kernel/module-plts.c | 17 ++++++++++++-----
> arch/arm64/kernel/module.c      | 10 ++++++++++
> kernel/module.c                 | 29 +++++++++++++++--------------
> 4 files changed, 38 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/module.h b/arch/arm64/include/asm/module.h
> index fef773c94e9d..ac9b97f9ae5e 100644
> --- a/arch/arm64/include/asm/module.h
> +++ b/arch/arm64/include/asm/module.h
> @@ -25,6 +25,7 @@ struct mod_plt_sec {
> 	struct elf64_shdr	*plt;
> 	int			plt_num_entries;
> 	int			plt_max_entries;
> +	int			plt_shndx;
> };

Does this mean we can drop the plt pointer from this struct altogether, and
simply offset into the section headers when applying the relocations?

Cheers,

Will

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2] arm64/module: use mod->klp_info section header information for livepatch modules
  2018-10-29 15:28       ` Will Deacon
@ 2018-10-30 13:19         ` Jessica Yu
  2018-11-01 15:18           ` Miroslav Benes
  2018-11-01 16:07           ` Will Deacon
  0 siblings, 2 replies; 48+ messages in thread
From: Jessica Yu @ 2018-10-30 13:19 UTC (permalink / raw)
  To: Will Deacon
  Cc: Torsten Duwe, Catalin Marinas, Julien Thierry, Steven Rostedt,
	Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel, Arnd Bergmann,
	AKASHI Takahiro, Miroslav Benes, Petr Mladek, linux-arm-kernel,
	linux-kernel, live-patching

+++ Will Deacon [29/10/18 15:28 +0000]:
>Hi Jessica,
>
>On Fri, Oct 26, 2018 at 07:25:01PM +0200, Jessica Yu wrote:
>> The arm64 module loader keeps a pointer into info->sechdrs to keep track
>> of section header information for .plt section(s). A pointer to the
>> relevent section header (struct elf64_shdr) in info->sechdrs is stored
>> in mod->arch.{init,core}.plt. This pointer may be accessed while
>> applying relocations in apply_relocate_add() for example. And unlike
>> normal modules, livepatch modules can call apply_relocate_add() after
>> module load. But the info struct (and therefore info->sechdrs) gets
>> freed at the end of load_module() and so mod->arch.{init,core}.plt
>> becomes an invalid pointer after the module is done loading.
>>
>> Luckily, livepatch modules already keep a copy of Elf section header
>> information in mod->klp_info. So make sure livepatch modules on arm64
>> have access to the section headers in klp_info and set
>> mod->arch.{init,core}.plt to the appropriate section header in
>> mod->klp_info so that they can call apply_relocate_add() even after
>> module load.
>>
>> Signed-off-by: Jessica Yu <jeyu@kernel.org>
>> ---
>>
>> v2:
>>  - fix missing free_module_elf() in error path
>>  - move copy_module_elf() and module_finalize() out of post_relocation()
>>    to make error handling more clear
>>  - add braces to if-else block in arm64 module_frob_arch_sections()
>>
>> arch/arm64/include/asm/module.h |  1 +
>> arch/arm64/kernel/module-plts.c | 17 ++++++++++++-----
>> arch/arm64/kernel/module.c      | 10 ++++++++++
>> kernel/module.c                 | 29 +++++++++++++++--------------
>> 4 files changed, 38 insertions(+), 19 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/module.h b/arch/arm64/include/asm/module.h
>> index fef773c94e9d..ac9b97f9ae5e 100644
>> --- a/arch/arm64/include/asm/module.h
>> +++ b/arch/arm64/include/asm/module.h
>> @@ -25,6 +25,7 @@ struct mod_plt_sec {
>> 	struct elf64_shdr	*plt;
>> 	int			plt_num_entries;
>> 	int			plt_max_entries;
>> +	int			plt_shndx;
>> };
>
>Does this mean we can drop the plt pointer from this struct altogether, and
>simply offset into the section headers when applying the relocations?

Hmm, if everyone is OK with dropping the plt pointer from struct
mod_plt_sec, then I think we can simplify this patch even further.

With the plt shndx saved, we can additionally pass a pointer to
sechdrs to module_emit_plt_entry(), and with that just offset into the
section headers as you suggest. Since livepatch *already* passes in
the correct copy of the section headers (mod->klp_info->sechdrs) to
apply_relocate_add(), we wouldn't even need to modify the arm64
module_finalize() to change mod->arch.core.plt to point into
mod->klp_info->sechdrs anymore and we can drop all the changes to the
module loader too.

Something like the following maybe?

diff --git a/arch/arm64/include/asm/module.h b/arch/arm64/include/asm/module.h
index fef773c94e9d..ac10fa066487 100644
--- a/arch/arm64/include/asm/module.h
+++ b/arch/arm64/include/asm/module.h
@@ -22,7 +22,7 @@
 
 #ifdef CONFIG_ARM64_MODULE_PLTS
 struct mod_plt_sec {
-	struct elf64_shdr	*plt;
+	int			plt_shndx;
 	int			plt_num_entries;
 	int			plt_max_entries;
 };
@@ -37,10 +37,12 @@ struct mod_arch_specific {
 };
 #endif
 
-u64 module_emit_plt_entry(struct module *mod, void *loc, const Elf64_Rela *rela,
+u64 module_emit_plt_entry(struct module *mod, Elf64_Shdr *sechdrs,
+			  void *loc, const Elf64_Rela *rela,
 			  Elf64_Sym *sym);
 
-u64 module_emit_veneer_for_adrp(struct module *mod, void *loc, u64 val);
+u64 module_emit_veneer_for_adrp(struct module *mod, Elf64_Shdr *sechdrs,
+				void *loc, u64 val);
 
 #ifdef CONFIG_RANDOMIZE_BASE
 extern u64 module_alloc_base;
diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c
index f0690c2ca3e0..3cd744a1cbc2 100644
--- a/arch/arm64/kernel/module-plts.c
+++ b/arch/arm64/kernel/module-plts.c
@@ -16,13 +16,15 @@ static bool in_init(const struct module *mod, void *loc)
 	return (u64)loc - (u64)mod->init_layout.base < mod->init_layout.size;
 }
 
-u64 module_emit_plt_entry(struct module *mod, void *loc, const Elf64_Rela *rela,
+u64 module_emit_plt_entry(struct module *mod, Elf64_Shdr *sechdrs,
+			  void *loc, const Elf64_Rela *rela,
 			  Elf64_Sym *sym)
 {
-	struct mod_plt_sec *pltsec = !in_init(mod, loc) ? &mod->arch.core :
-							  &mod->arch.init;
-	struct plt_entry *plt = (struct plt_entry *)pltsec->plt->sh_addr;
-	int i = pltsec->plt_num_entries;
+	struct mod_plt_sec *plt_info = !in_init(mod, loc) ? &mod->arch.core :
+							    &mod->arch.init;
+	Elf64_Shdr *pltsec = sechdrs + plt_info->plt_shndx;
+	struct plt_entry *plt = (struct plt_entry *)pltsec->sh_addr;
+	int i = plt_info->plt_num_entries;
 	u64 val = sym->st_value + rela->r_addend;
 
 	plt[i] = get_plt_entry(val);
@@ -35,24 +37,26 @@ u64 module_emit_plt_entry(struct module *mod, void *loc, const Elf64_Rela *rela,
 	if (i > 0 && plt_entries_equal(plt + i, plt + i - 1))
 		return (u64)&plt[i - 1];
 
-	pltsec->plt_num_entries++;
-	if (WARN_ON(pltsec->plt_num_entries > pltsec->plt_max_entries))
+	plt_info->plt_num_entries++;
+	if (WARN_ON(plt_info->plt_num_entries > plt_info->plt_max_entries))
 		return 0;
 
 	return (u64)&plt[i];
 }
 
 #ifdef CONFIG_ARM64_ERRATUM_843419
-u64 module_emit_veneer_for_adrp(struct module *mod, void *loc, u64 val)
+u64 module_emit_veneer_for_adrp(struct module *mod, Elf64_Shdr *sechdrs,
+				void *loc, u64 val)
 {
-	struct mod_plt_sec *pltsec = !in_init(mod, loc) ? &mod->arch.core :
-							  &mod->arch.init;
-	struct plt_entry *plt = (struct plt_entry *)pltsec->plt->sh_addr;
-	int i = pltsec->plt_num_entries++;
+	struct mod_plt_sec *plt_info = !in_init(mod, loc) ? &mod->arch.core :
+							    &mod->arch.init;
+	Elf64_Shdr *pltsec = sechdrs + plt_info->plt_shndx;
+	struct plt_entry *plt = (struct plt_entry *)pltsec->sh_addr;
+	int i = plt_info->plt_num_entries++;
 	u32 mov0, mov1, mov2, br;
 	int rd;
 
-	if (WARN_ON(pltsec->plt_num_entries > pltsec->plt_max_entries))
+	if (WARN_ON(plt_info->plt_num_entries > plt_info->plt_max_entries))
 		return 0;
 
 	/* get the destination register of the ADRP instruction */
@@ -202,7 +206,7 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 	unsigned long core_plts = 0;
 	unsigned long init_plts = 0;
 	Elf64_Sym *syms = NULL;
-	Elf_Shdr *tramp = NULL;
+	Elf_Shdr *pltsec, *tramp = NULL;
 	int i;
 
 	/*
@@ -211,9 +215,9 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 	 */
 	for (i = 0; i < ehdr->e_shnum; i++) {
 		if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
-			mod->arch.core.plt = sechdrs + i;
+			mod->arch.core.plt_shndx = i;
 		else if (!strcmp(secstrings + sechdrs[i].sh_name, ".init.plt"))
-			mod->arch.init.plt = sechdrs + i;
+			mod->arch.init.plt_shndx = i;
 		else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
 			 !strcmp(secstrings + sechdrs[i].sh_name,
 				 ".text.ftrace_trampoline"))
@@ -222,7 +226,7 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 			syms = (Elf64_Sym *)sechdrs[i].sh_addr;
 	}
 
-	if (!mod->arch.core.plt || !mod->arch.init.plt) {
+	if (!mod->arch.core.plt_shndx || !mod->arch.init.plt_shndx) {
 		pr_err("%s: module PLT section(s) missing\n", mod->name);
 		return -ENOEXEC;
 	}
@@ -254,17 +258,19 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 						sechdrs[i].sh_info, dstsec);
 	}
 
-	mod->arch.core.plt->sh_type = SHT_NOBITS;
-	mod->arch.core.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
-	mod->arch.core.plt->sh_addralign = L1_CACHE_BYTES;
-	mod->arch.core.plt->sh_size = (core_plts  + 1) * sizeof(struct plt_entry);
+	pltsec = sechdrs + mod->arch.core.plt_shndx;
+	pltsec->sh_type = SHT_NOBITS;
+	pltsec->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
+	pltsec->sh_addralign = L1_CACHE_BYTES;
+	pltsec->sh_size = (core_plts  + 1) * sizeof(struct plt_entry);
 	mod->arch.core.plt_num_entries = 0;
 	mod->arch.core.plt_max_entries = core_plts;
 
-	mod->arch.init.plt->sh_type = SHT_NOBITS;
-	mod->arch.init.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
-	mod->arch.init.plt->sh_addralign = L1_CACHE_BYTES;
-	mod->arch.init.plt->sh_size = (init_plts + 1) * sizeof(struct plt_entry);
+	pltsec = sechdrs + mod->arch.init.plt_shndx;
+	pltsec->sh_type = SHT_NOBITS;
+	pltsec->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
+	pltsec->sh_addralign = L1_CACHE_BYTES;
+	pltsec->sh_size = (init_plts + 1) * sizeof(struct plt_entry);
 	mod->arch.init.plt_num_entries = 0;
 	mod->arch.init.plt_max_entries = init_plts;
 
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index dd23655fda3a..8e6444db2d8e 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -198,7 +198,8 @@ static int reloc_insn_imm(enum aarch64_reloc_op op, __le32 *place, u64 val,
 	return 0;
 }
 
-static int reloc_insn_adrp(struct module *mod, __le32 *place, u64 val)
+static int reloc_insn_adrp(struct module *mod, Elf64_Shdr *sechdrs,
+			   __le32 *place, u64 val)
 {
 	u32 insn;
 
@@ -215,7 +216,7 @@ static int reloc_insn_adrp(struct module *mod, __le32 *place, u64 val)
 		insn &= ~BIT(31);
 	} else {
 		/* out of range for ADR -> emit a veneer */
-		val = module_emit_veneer_for_adrp(mod, place, val & ~0xfff);
+		val = module_emit_veneer_for_adrp(mod, sechdrs, place, val & ~0xfff);
 		if (!val)
 			return -ENOEXEC;
 		insn = aarch64_insn_gen_branch_imm((u64)place, val,
@@ -368,7 +369,7 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
 		case R_AARCH64_ADR_PREL_PG_HI21_NC:
 			overflow_check = false;
 		case R_AARCH64_ADR_PREL_PG_HI21:
-			ovf = reloc_insn_adrp(me, loc, val);
+			ovf = reloc_insn_adrp(me, sechdrs, loc, val);
 			if (ovf && ovf != -ERANGE)
 				return ovf;
 			break;
@@ -413,7 +414,7 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
 
 			if (IS_ENABLED(CONFIG_ARM64_MODULE_PLTS) &&
 			    ovf == -ERANGE) {
-				val = module_emit_plt_entry(me, loc, &rel[i], sym);
+				val = module_emit_plt_entry(me, sechdrs, loc, &rel[i], sym);
 				if (!val)
 					return -ENOEXEC;
 				ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2,

Perhaps this approach is better. Miroslav and Petr, do you think this
would work? (Apologies for the efforts to review the last two
versions, if we end up scrapping the old patch :-/)

Thanks,

Jessica

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2] arm64/module: use mod->klp_info section header information for livepatch modules
  2018-10-30 13:19         ` Jessica Yu
@ 2018-11-01 15:18           ` Miroslav Benes
  2018-11-01 16:07           ` Will Deacon
  1 sibling, 0 replies; 48+ messages in thread
From: Miroslav Benes @ 2018-11-01 15:18 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Will Deacon, Torsten Duwe, Catalin Marinas, Julien Thierry,
	Steven Rostedt, Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel,
	Arnd Bergmann, AKASHI Takahiro, Petr Mladek, linux-arm-kernel,
	linux-kernel, live-patching


> >Does this mean we can drop the plt pointer from this struct altogether, and
> >simply offset into the section headers when applying the relocations?
> 
> Hmm, if everyone is OK with dropping the plt pointer from struct
> mod_plt_sec, then I think we can simplify this patch even further.
> 
> With the plt shndx saved, we can additionally pass a pointer to
> sechdrs to module_emit_plt_entry(), and with that just offset into the
> section headers as you suggest. Since livepatch *already* passes in
> the correct copy of the section headers (mod->klp_info->sechdrs) to
> apply_relocate_add(), we wouldn't even need to modify the arm64
> module_finalize() to change mod->arch.core.plt to point into
> mod->klp_info->sechdrs anymore and we can drop all the changes to the
> module loader too.
> 
> Something like the following maybe?
> 
> diff --git a/arch/arm64/include/asm/module.h b/arch/arm64/include/asm/module.h
> index fef773c94e9d..ac10fa066487 100644
> --- a/arch/arm64/include/asm/module.h
> +++ b/arch/arm64/include/asm/module.h
> @@ -22,7 +22,7 @@
> 
> #ifdef CONFIG_ARM64_MODULE_PLTS
> struct mod_plt_sec {
> -	struct elf64_shdr	*plt;
> +	int			plt_shndx;
> 	int			plt_num_entries;
> 	int			plt_max_entries;
> };
> @@ -37,10 +37,12 @@ struct mod_arch_specific {
> };
> #endif
> 
> -u64 module_emit_plt_entry(struct module *mod, void *loc, const Elf64_Rela
> *rela,
> +u64 module_emit_plt_entry(struct module *mod, Elf64_Shdr *sechdrs,
> +			  void *loc, const Elf64_Rela *rela,
> 			  Elf64_Sym *sym);
> 
> -u64 module_emit_veneer_for_adrp(struct module *mod, void *loc, u64 val);
> +u64 module_emit_veneer_for_adrp(struct module *mod, Elf64_Shdr *sechdrs,
> +				void *loc, u64 val);
> 
> #ifdef CONFIG_RANDOMIZE_BASE
> extern u64 module_alloc_base;
> diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c
> index f0690c2ca3e0..3cd744a1cbc2 100644
> --- a/arch/arm64/kernel/module-plts.c
> +++ b/arch/arm64/kernel/module-plts.c
> @@ -16,13 +16,15 @@ static bool in_init(const struct module *mod, void *loc)
> 	return (u64)loc - (u64)mod->init_layout.base < mod->init_layout.size;
> }
> 
> -u64 module_emit_plt_entry(struct module *mod, void *loc, const Elf64_Rela
> *rela,
> +u64 module_emit_plt_entry(struct module *mod, Elf64_Shdr *sechdrs,
> +			  void *loc, const Elf64_Rela *rela,
> 			  Elf64_Sym *sym)
> {
> -	struct mod_plt_sec *pltsec = !in_init(mod, loc) ? &mod->arch.core :
> -							  &mod->arch.init;
> -	struct plt_entry *plt = (struct plt_entry *)pltsec->plt->sh_addr;
> -	int i = pltsec->plt_num_entries;
> +	struct mod_plt_sec *plt_info = !in_init(mod, loc) ? &mod->arch.core :
> +							    &mod->arch.init;
> +	Elf64_Shdr *pltsec = sechdrs + plt_info->plt_shndx;
> +	struct plt_entry *plt = (struct plt_entry *)pltsec->sh_addr;
> +	int i = plt_info->plt_num_entries;
> 	u64 val = sym->st_value + rela->r_addend;
> 
> 	plt[i] = get_plt_entry(val);
> @@ -35,24 +37,26 @@ u64 module_emit_plt_entry(struct module *mod, void *loc,
> const Elf64_Rela *rela,
> 	if (i > 0 && plt_entries_equal(plt + i, plt + i - 1))
> 		return (u64)&plt[i - 1];
> 
> -	pltsec->plt_num_entries++;
> -	if (WARN_ON(pltsec->plt_num_entries > pltsec->plt_max_entries))
> +	plt_info->plt_num_entries++;
> +	if (WARN_ON(plt_info->plt_num_entries > plt_info->plt_max_entries))
> 		return 0;
> 
> 	return (u64)&plt[i];
> }
> 
> #ifdef CONFIG_ARM64_ERRATUM_843419
> -u64 module_emit_veneer_for_adrp(struct module *mod, void *loc, u64 val)
> +u64 module_emit_veneer_for_adrp(struct module *mod, Elf64_Shdr *sechdrs,
> +				void *loc, u64 val)
> {
> -	struct mod_plt_sec *pltsec = !in_init(mod, loc) ? &mod->arch.core :
> -							  &mod->arch.init;
> -	struct plt_entry *plt = (struct plt_entry *)pltsec->plt->sh_addr;
> -	int i = pltsec->plt_num_entries++;
> +	struct mod_plt_sec *plt_info = !in_init(mod, loc) ? &mod->arch.core :
> +							    &mod->arch.init;
> +	Elf64_Shdr *pltsec = sechdrs + plt_info->plt_shndx;
> +	struct plt_entry *plt = (struct plt_entry *)pltsec->sh_addr;
> +	int i = plt_info->plt_num_entries++;
> 	u32 mov0, mov1, mov2, br;
> 	int rd;
> 
> -	if (WARN_ON(pltsec->plt_num_entries > pltsec->plt_max_entries))
> +	if (WARN_ON(plt_info->plt_num_entries > plt_info->plt_max_entries))
> 		return 0;
> 
> 	/* get the destination register of the ADRP instruction */
> @@ -202,7 +206,7 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr
> *sechdrs,
> 	unsigned long core_plts = 0;
> 	unsigned long init_plts = 0;
> 	Elf64_Sym *syms = NULL;
> -	Elf_Shdr *tramp = NULL;
> +	Elf_Shdr *pltsec, *tramp = NULL;
> 	int i;
> 
> 	/*
> @@ -211,9 +215,9 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr
> *sechdrs,
> 	*/
> 	for (i = 0; i < ehdr->e_shnum; i++) {
> 		if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
> -			mod->arch.core.plt = sechdrs + i;
> +			mod->arch.core.plt_shndx = i;
> 		else if (!strcmp(secstrings + sechdrs[i].sh_name,
> ".init.plt"))
> -			mod->arch.init.plt = sechdrs + i;
> +			mod->arch.init.plt_shndx = i;
> 		else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
> 			 !strcmp(secstrings + sechdrs[i].sh_name,
> 				 ".text.ftrace_trampoline"))
> @@ -222,7 +226,7 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr
> *sechdrs,
> 			syms = (Elf64_Sym *)sechdrs[i].sh_addr;
> 	}
> 
> -	if (!mod->arch.core.plt || !mod->arch.init.plt) {
> +	if (!mod->arch.core.plt_shndx || !mod->arch.init.plt_shndx) {
> 		pr_err("%s: module PLT section(s) missing\n", mod->name);
> 		return -ENOEXEC;
> 	}
> @@ -254,17 +258,19 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr
> *sechdrs,
> 						sechdrs[i].sh_info, dstsec);
> 	}
> 
> -	mod->arch.core.plt->sh_type = SHT_NOBITS;
> -	mod->arch.core.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
> -	mod->arch.core.plt->sh_addralign = L1_CACHE_BYTES;
> -	mod->arch.core.plt->sh_size = (core_plts  + 1) * sizeof(struct
> plt_entry);
> +	pltsec = sechdrs + mod->arch.core.plt_shndx;
> +	pltsec->sh_type = SHT_NOBITS;
> +	pltsec->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
> +	pltsec->sh_addralign = L1_CACHE_BYTES;
> +	pltsec->sh_size = (core_plts  + 1) * sizeof(struct plt_entry);
> 	mod->arch.core.plt_num_entries = 0;
> 	mod->arch.core.plt_max_entries = core_plts;
> 
> -	mod->arch.init.plt->sh_type = SHT_NOBITS;
> -	mod->arch.init.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
> -	mod->arch.init.plt->sh_addralign = L1_CACHE_BYTES;
> -	mod->arch.init.plt->sh_size = (init_plts + 1) * sizeof(struct
> plt_entry);
> +	pltsec = sechdrs + mod->arch.init.plt_shndx;
> +	pltsec->sh_type = SHT_NOBITS;
> +	pltsec->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
> +	pltsec->sh_addralign = L1_CACHE_BYTES;
> +	pltsec->sh_size = (init_plts + 1) * sizeof(struct plt_entry);
> 	mod->arch.init.plt_num_entries = 0;
> 	mod->arch.init.plt_max_entries = init_plts;
> 
> diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
> index dd23655fda3a..8e6444db2d8e 100644
> --- a/arch/arm64/kernel/module.c
> +++ b/arch/arm64/kernel/module.c
> @@ -198,7 +198,8 @@ static int reloc_insn_imm(enum aarch64_reloc_op op, __le32
> *place, u64 val,
> 	return 0;
> }
> 
> -static int reloc_insn_adrp(struct module *mod, __le32 *place, u64 val)
> +static int reloc_insn_adrp(struct module *mod, Elf64_Shdr *sechdrs,
> +			   __le32 *place, u64 val)
> {
> 	u32 insn;
> 
> @@ -215,7 +216,7 @@ static int reloc_insn_adrp(struct module *mod, __le32
> *place, u64 val)
> 		insn &= ~BIT(31);
> 	} else {
> 		/* out of range for ADR -> emit a veneer */
> -		val = module_emit_veneer_for_adrp(mod, place, val & ~0xfff);
> +		val = module_emit_veneer_for_adrp(mod, sechdrs, place, val &
> ~0xfff);
> 		if (!val)
> 			return -ENOEXEC;
> 		insn = aarch64_insn_gen_branch_imm((u64)place, val,
> @@ -368,7 +369,7 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
> 		case R_AARCH64_ADR_PREL_PG_HI21_NC:
> 			overflow_check = false;
> 		case R_AARCH64_ADR_PREL_PG_HI21:
> -			ovf = reloc_insn_adrp(me, loc, val);
> +			ovf = reloc_insn_adrp(me, sechdrs, loc, val);
> 			if (ovf && ovf != -ERANGE)
> 				return ovf;
> 			break;
> @@ -413,7 +414,7 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
> 
> 			if (IS_ENABLED(CONFIG_ARM64_MODULE_PLTS) &&
> 			    ovf == -ERANGE) {
> -				val = module_emit_plt_entry(me, loc, &rel[i],
> sym);
> +				val = module_emit_plt_entry(me, sechdrs, loc,
> &rel[i], sym);
> 				if (!val)
> 					return -ENOEXEC;
> 				ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val,
> 				2,
> 
> Perhaps this approach is better. Miroslav and Petr, do you think this
> would work? (Apologies for the efforts to review the last two
> versions, if we end up scrapping the old patch :-/)

No problem. I think it should work and it looks good to me (I did not 
compile it though). I'm glad we don't have to touch load_module(). The 
function is complicated enough.

Thanks,
Miroslav

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2] arm64/module: use mod->klp_info section header information for livepatch modules
  2018-10-30 13:19         ` Jessica Yu
  2018-11-01 15:18           ` Miroslav Benes
@ 2018-11-01 16:07           ` Will Deacon
  2018-11-05 12:30             ` Ard Biesheuvel
  1 sibling, 1 reply; 48+ messages in thread
From: Will Deacon @ 2018-11-01 16:07 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Torsten Duwe, Catalin Marinas, Julien Thierry, Steven Rostedt,
	Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel, Arnd Bergmann,
	AKASHI Takahiro, Miroslav Benes, Petr Mladek, linux-arm-kernel,
	linux-kernel, live-patching

Hello, Jessica,

On Tue, Oct 30, 2018 at 02:19:10PM +0100, Jessica Yu wrote:
> +++ Will Deacon [29/10/18 15:28 +0000]:
> >On Fri, Oct 26, 2018 at 07:25:01PM +0200, Jessica Yu wrote:
> >>diff --git a/arch/arm64/include/asm/module.h b/arch/arm64/include/asm/module.h
> >>index fef773c94e9d..ac9b97f9ae5e 100644
> >>--- a/arch/arm64/include/asm/module.h
> >>+++ b/arch/arm64/include/asm/module.h
> >>@@ -25,6 +25,7 @@ struct mod_plt_sec {
> >>	struct elf64_shdr	*plt;
> >>	int			plt_num_entries;
> >>	int			plt_max_entries;
> >>+	int			plt_shndx;
> >>};
> >
> >Does this mean we can drop the plt pointer from this struct altogether, and
> >simply offset into the section headers when applying the relocations?
> 
> Hmm, if everyone is OK with dropping the plt pointer from struct
> mod_plt_sec, then I think we can simplify this patch even further.
> 
> With the plt shndx saved, we can additionally pass a pointer to
> sechdrs to module_emit_plt_entry(), and with that just offset into the
> section headers as you suggest. Since livepatch *already* passes in
> the correct copy of the section headers (mod->klp_info->sechdrs) to
> apply_relocate_add(), we wouldn't even need to modify the arm64
> module_finalize() to change mod->arch.core.plt to point into
> mod->klp_info->sechdrs anymore and we can drop all the changes to the
> module loader too.
> 
> Something like the following maybe?

This looks pretty good, thanks! My only (minor) objection is that the
renaming of plt_sec -> plt_info throughout makes the patch a lot more
churny than it needs to be, for questionable gain.

Anyway, it looks functionally correct and I've tested loading/unloading
the "hello world" test module with both PLTs enabled and disabled.

Acked-by: Will Deacon <will.deacon@arm.com>

Will

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2] arm64/module: use mod->klp_info section header information for livepatch modules
  2018-11-01 16:07           ` Will Deacon
@ 2018-11-05 12:30             ` Ard Biesheuvel
  0 siblings, 0 replies; 48+ messages in thread
From: Ard Biesheuvel @ 2018-11-05 12:30 UTC (permalink / raw)
  To: Will Deacon
  Cc: Jessica Yu, Torsten Duwe, Catalin Marinas, Julien Thierry,
	Steven Rostedt, Josh Poimboeuf, Ingo Molnar, Arnd Bergmann,
	AKASHI Takahiro, Miroslav Benes, Petr Mladek, linux-arm-kernel,
	Linux Kernel Mailing List, live-patching

On 1 November 2018 at 17:07, Will Deacon <will.deacon@arm.com> wrote:
> Hello, Jessica,
>
> On Tue, Oct 30, 2018 at 02:19:10PM +0100, Jessica Yu wrote:
>> +++ Will Deacon [29/10/18 15:28 +0000]:
>> >On Fri, Oct 26, 2018 at 07:25:01PM +0200, Jessica Yu wrote:
>> >>diff --git a/arch/arm64/include/asm/module.h b/arch/arm64/include/asm/module.h
>> >>index fef773c94e9d..ac9b97f9ae5e 100644
>> >>--- a/arch/arm64/include/asm/module.h
>> >>+++ b/arch/arm64/include/asm/module.h
>> >>@@ -25,6 +25,7 @@ struct mod_plt_sec {
>> >>    struct elf64_shdr       *plt;
>> >>    int                     plt_num_entries;
>> >>    int                     plt_max_entries;
>> >>+   int                     plt_shndx;
>> >>};
>> >
>> >Does this mean we can drop the plt pointer from this struct altogether, and
>> >simply offset into the section headers when applying the relocations?
>>
>> Hmm, if everyone is OK with dropping the plt pointer from struct
>> mod_plt_sec, then I think we can simplify this patch even further.
>>
>> With the plt shndx saved, we can additionally pass a pointer to
>> sechdrs to module_emit_plt_entry(), and with that just offset into the
>> section headers as you suggest. Since livepatch *already* passes in
>> the correct copy of the section headers (mod->klp_info->sechdrs) to
>> apply_relocate_add(), we wouldn't even need to modify the arm64
>> module_finalize() to change mod->arch.core.plt to point into
>> mod->klp_info->sechdrs anymore and we can drop all the changes to the
>> module loader too.
>>
>> Something like the following maybe?
>
> This looks pretty good, thanks! My only (minor) objection is that the
> renaming of plt_sec -> plt_info throughout makes the patch a lot more
> churny than it needs to be, for questionable gain.
>
> Anyway, it looks functionally correct and I've tested loading/unloading
> the "hello world" test module with both PLTs enabled and disabled.
>
> Acked-by: Will Deacon <will.deacon@arm.com>
>

For the simplified version:

Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH] arm64/module: use plt section indices for relocations
  2018-10-01 14:16 ` [PATCH v3 3/4] arm64: implement live patching Torsten Duwe
  2018-10-17 13:39   ` Miroslav Benes
  2018-10-23 17:55   ` [PATCH] arm64/module: use mod->klp_info section header information Jessica Yu
@ 2018-11-05 17:57   ` Jessica Yu
  2018-11-05 18:04     ` Ard Biesheuvel
  2018-11-05 18:53     ` [PATCH v2] " Jessica Yu
  2 siblings, 2 replies; 48+ messages in thread
From: Jessica Yu @ 2018-11-05 17:57 UTC (permalink / raw)
  To: Torsten Duwe
  Cc: Will Deacon, Catalin Marinas, Julien Thierry, Steven Rostedt,
	Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel, Arnd Bergmann,
	AKASHI Takahiro, linux-arm-kernel, linux-kernel, live-patching

Instead of saving a pointer to the .plt and .init.plt sections to apply
plt-based relocations, save and use their section indices instead.

The mod->arch.{core,init}.plt pointers were problematic for livepatch
because they pointed within temporary section headers (provided by the
module loader via info->sechdrs) that would be freed after module load.
Since livepatch modules may need to apply relocations post-module-load
(for example, to patch a module that is loaded later), using section
indices to offset into the section headers (instead of accessing them
through a saved pointer) allows livepatch modules on arm64 to pass in
their own copy of the section headers to apply_relocate_add() to apply
delayed relocations.

Signed-off-by: Jessica Yu <jeyu@kernel.org>
---

Note: Addressed Will's comment about the pltsec -> plt_info rename and
removed that change to reduce unnecessary code churn. That is the only
change from the first version. I didn't include the Ack's for this reason
so let me know if this version is OK as well.

 arch/arm64/include/asm/module.h |  8 +++++---
 arch/arm64/kernel/module-plts.c | 36 ++++++++++++++++++++----------------
 arch/arm64/kernel/module.c      |  9 +++++----
 3 files changed, 30 insertions(+), 23 deletions(-)

diff --git a/arch/arm64/include/asm/module.h b/arch/arm64/include/asm/module.h
index 97d0ef12e2ff..f81c1847312d 100644
--- a/arch/arm64/include/asm/module.h
+++ b/arch/arm64/include/asm/module.h
@@ -22,7 +22,7 @@
 
 #ifdef CONFIG_ARM64_MODULE_PLTS
 struct mod_plt_sec {
-	struct elf64_shdr	*plt;
+	int			plt_shndx;
 	int			plt_num_entries;
 	int			plt_max_entries;
 };
@@ -36,10 +36,12 @@ struct mod_arch_specific {
 };
 #endif
 
-u64 module_emit_plt_entry(struct module *mod, void *loc, const Elf64_Rela *rela,
+u64 module_emit_plt_entry(struct module *mod, Elf64_Shdr *sechdrs,
+			  void *loc, const Elf64_Rela *rela,
 			  Elf64_Sym *sym);
 
-u64 module_emit_veneer_for_adrp(struct module *mod, void *loc, u64 val);
+u64 module_emit_veneer_for_adrp(struct module *mod, Elf64_Shdr *sechdrs,
+				void *loc, u64 val);
 
 #ifdef CONFIG_RANDOMIZE_BASE
 extern u64 module_alloc_base;
diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c
index f0690c2ca3e0..c04b0d6abde0 100644
--- a/arch/arm64/kernel/module-plts.c
+++ b/arch/arm64/kernel/module-plts.c
@@ -16,12 +16,13 @@ static bool in_init(const struct module *mod, void *loc)
 	return (u64)loc - (u64)mod->init_layout.base < mod->init_layout.size;
 }
 
-u64 module_emit_plt_entry(struct module *mod, void *loc, const Elf64_Rela *rela,
+u64 module_emit_plt_entry(struct module *mod, Elf64_Shdr *sechdrs,
+			  void *loc, const Elf64_Rela *rela,
 			  Elf64_Sym *sym)
 {
 	struct mod_plt_sec *pltsec = !in_init(mod, loc) ? &mod->arch.core :
 							  &mod->arch.init;
-	struct plt_entry *plt = (struct plt_entry *)pltsec->plt->sh_addr;
+	struct plt_entry *plt = (struct plt_entry *)(sechdrs + pltsec->plt_shndx)->sh_addr;
 	int i = pltsec->plt_num_entries;
 	u64 val = sym->st_value + rela->r_addend;
 
@@ -43,11 +44,12 @@ u64 module_emit_plt_entry(struct module *mod, void *loc, const Elf64_Rela *rela,
 }
 
 #ifdef CONFIG_ARM64_ERRATUM_843419
-u64 module_emit_veneer_for_adrp(struct module *mod, void *loc, u64 val)
+u64 module_emit_veneer_for_adrp(struct module *mod, Elf64_Shdr *sechdrs,
+				void *loc, u64 val)
 {
 	struct mod_plt_sec *pltsec = !in_init(mod, loc) ? &mod->arch.core :
 							  &mod->arch.init;
-	struct plt_entry *plt = (struct plt_entry *)pltsec->plt->sh_addr;
+	struct plt_entry *plt = (struct plt_entry *)(sechdrs + pltsec->plt_shndx)->sh_addr;
 	int i = pltsec->plt_num_entries++;
 	u32 mov0, mov1, mov2, br;
 	int rd;
@@ -202,7 +204,7 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 	unsigned long core_plts = 0;
 	unsigned long init_plts = 0;
 	Elf64_Sym *syms = NULL;
-	Elf_Shdr *tramp = NULL;
+	Elf_Shdr *pltsec, *tramp = NULL;
 	int i;
 
 	/*
@@ -211,9 +213,9 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 	 */
 	for (i = 0; i < ehdr->e_shnum; i++) {
 		if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
-			mod->arch.core.plt = sechdrs + i;
+			mod->arch.core.plt_shndx = i;
 		else if (!strcmp(secstrings + sechdrs[i].sh_name, ".init.plt"))
-			mod->arch.init.plt = sechdrs + i;
+			mod->arch.init.plt_shndx = i;
 		else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
 			 !strcmp(secstrings + sechdrs[i].sh_name,
 				 ".text.ftrace_trampoline"))
@@ -222,7 +224,7 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 			syms = (Elf64_Sym *)sechdrs[i].sh_addr;
 	}
 
-	if (!mod->arch.core.plt || !mod->arch.init.plt) {
+	if (!mod->arch.core.plt_shndx || !mod->arch.init.plt_shndx) {
 		pr_err("%s: module PLT section(s) missing\n", mod->name);
 		return -ENOEXEC;
 	}
@@ -254,17 +256,19 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 						sechdrs[i].sh_info, dstsec);
 	}
 
-	mod->arch.core.plt->sh_type = SHT_NOBITS;
-	mod->arch.core.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
-	mod->arch.core.plt->sh_addralign = L1_CACHE_BYTES;
-	mod->arch.core.plt->sh_size = (core_plts  + 1) * sizeof(struct plt_entry);
+	pltsec = sechdrs + mod->arch.core.plt_shndx;
+	pltsec->sh_type = SHT_NOBITS;
+	pltsec->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
+	pltsec->sh_addralign = L1_CACHE_BYTES;
+	pltsec->sh_size = (core_plts  + 1) * sizeof(struct plt_entry);
 	mod->arch.core.plt_num_entries = 0;
 	mod->arch.core.plt_max_entries = core_plts;
 
-	mod->arch.init.plt->sh_type = SHT_NOBITS;
-	mod->arch.init.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
-	mod->arch.init.plt->sh_addralign = L1_CACHE_BYTES;
-	mod->arch.init.plt->sh_size = (init_plts + 1) * sizeof(struct plt_entry);
+	pltsec = sechdrs + mod->arch.init.plt_shndx;
+	pltsec->sh_type = SHT_NOBITS;
+	pltsec->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
+	pltsec->sh_addralign = L1_CACHE_BYTES;
+	pltsec->sh_size = (init_plts + 1) * sizeof(struct plt_entry);
 	mod->arch.init.plt_num_entries = 0;
 	mod->arch.init.plt_max_entries = init_plts;
 
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index f0f27aeefb73..c2abe59c6f8f 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -198,7 +198,8 @@ static int reloc_insn_imm(enum aarch64_reloc_op op, __le32 *place, u64 val,
 	return 0;
 }
 
-static int reloc_insn_adrp(struct module *mod, __le32 *place, u64 val)
+static int reloc_insn_adrp(struct module *mod, Elf64_Shdr *sechdrs,
+			   __le32 *place, u64 val)
 {
 	u32 insn;
 
@@ -215,7 +216,7 @@ static int reloc_insn_adrp(struct module *mod, __le32 *place, u64 val)
 		insn &= ~BIT(31);
 	} else {
 		/* out of range for ADR -> emit a veneer */
-		val = module_emit_veneer_for_adrp(mod, place, val & ~0xfff);
+		val = module_emit_veneer_for_adrp(mod, sechdrs, place, val & ~0xfff);
 		if (!val)
 			return -ENOEXEC;
 		insn = aarch64_insn_gen_branch_imm((u64)place, val,
@@ -368,7 +369,7 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
 		case R_AARCH64_ADR_PREL_PG_HI21_NC:
 			overflow_check = false;
 		case R_AARCH64_ADR_PREL_PG_HI21:
-			ovf = reloc_insn_adrp(me, loc, val);
+			ovf = reloc_insn_adrp(me, sechdrs, loc, val);
 			if (ovf && ovf != -ERANGE)
 				return ovf;
 			break;
@@ -413,7 +414,7 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
 
 			if (IS_ENABLED(CONFIG_ARM64_MODULE_PLTS) &&
 			    ovf == -ERANGE) {
-				val = module_emit_plt_entry(me, loc, &rel[i], sym);
+				val = module_emit_plt_entry(me, sechdrs, loc, &rel[i], sym);
 				if (!val)
 					return -ENOEXEC;
 				ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2,
-- 
2.14.5


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] arm64/module: use plt section indices for relocations
  2018-11-05 17:57   ` [PATCH] arm64/module: use plt section indices for relocations Jessica Yu
@ 2018-11-05 18:04     ` Ard Biesheuvel
  2018-11-05 18:53     ` [PATCH v2] " Jessica Yu
  1 sibling, 0 replies; 48+ messages in thread
From: Ard Biesheuvel @ 2018-11-05 18:04 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Torsten Duwe, Will Deacon, Catalin Marinas, Julien Thierry,
	Steven Rostedt, Josh Poimboeuf, Ingo Molnar, Arnd Bergmann,
	AKASHI Takahiro, linux-arm-kernel, Linux Kernel Mailing List,
	live-patching

On 5 November 2018 at 18:57, Jessica Yu <jeyu@kernel.org> wrote:
> Instead of saving a pointer to the .plt and .init.plt sections to apply
> plt-based relocations, save and use their section indices instead.
>
> The mod->arch.{core,init}.plt pointers were problematic for livepatch
> because they pointed within temporary section headers (provided by the
> module loader via info->sechdrs) that would be freed after module load.
> Since livepatch modules may need to apply relocations post-module-load
> (for example, to patch a module that is loaded later), using section
> indices to offset into the section headers (instead of accessing them
> through a saved pointer) allows livepatch modules on arm64 to pass in
> their own copy of the section headers to apply_relocate_add() to apply
> delayed relocations.
>
> Signed-off-by: Jessica Yu <jeyu@kernel.org>
> ---
>
> Note: Addressed Will's comment about the pltsec -> plt_info rename and
> removed that change to reduce unnecessary code churn. That is the only
> change from the first version. I didn't include the Ack's for this reason
> so let me know if this version is OK as well.
>
> arch/arm64/include/asm/module.h |  8 +++++---
> arch/arm64/kernel/module-plts.c | 36 ++++++++++++++++++++----------------
> arch/arm64/kernel/module.c      |  9 +++++----
> 3 files changed, 30 insertions(+), 23 deletions(-)
>
> diff --git a/arch/arm64/include/asm/module.h
> b/arch/arm64/include/asm/module.h
> index 97d0ef12e2ff..f81c1847312d 100644
> --- a/arch/arm64/include/asm/module.h
> +++ b/arch/arm64/include/asm/module.h
> @@ -22,7 +22,7 @@
>
> #ifdef CONFIG_ARM64_MODULE_PLTS
> struct mod_plt_sec {
> -       struct elf64_shdr       *plt;
> +       int                     plt_shndx;
>         int                     plt_num_entries;
>         int                     plt_max_entries;
> };
> @@ -36,10 +36,12 @@ struct mod_arch_specific {
> };
> #endif
>
> -u64 module_emit_plt_entry(struct module *mod, void *loc, const Elf64_Rela
> *rela,
> +u64 module_emit_plt_entry(struct module *mod, Elf64_Shdr *sechdrs,
> +                         void *loc, const Elf64_Rela *rela,
>                           Elf64_Sym *sym);
>
> -u64 module_emit_veneer_for_adrp(struct module *mod, void *loc, u64 val);
> +u64 module_emit_veneer_for_adrp(struct module *mod, Elf64_Shdr *sechdrs,
> +                               void *loc, u64 val);
>
> #ifdef CONFIG_RANDOMIZE_BASE
> extern u64 module_alloc_base;
> diff --git a/arch/arm64/kernel/module-plts.c
> b/arch/arm64/kernel/module-plts.c
> index f0690c2ca3e0..c04b0d6abde0 100644
> --- a/arch/arm64/kernel/module-plts.c
> +++ b/arch/arm64/kernel/module-plts.c
> @@ -16,12 +16,13 @@ static bool in_init(const struct module *mod, void *loc)
>         return (u64)loc - (u64)mod->init_layout.base <
> mod->init_layout.size;
> }
>
> -u64 module_emit_plt_entry(struct module *mod, void *loc, const Elf64_Rela
> *rela,
> +u64 module_emit_plt_entry(struct module *mod, Elf64_Shdr *sechdrs,
> +                         void *loc, const Elf64_Rela *rela,
>                           Elf64_Sym *sym)
> {
>         struct mod_plt_sec *pltsec = !in_init(mod, loc) ? &mod->arch.core :
>                                                           &mod->arch.init;
> -       struct plt_entry *plt = (struct plt_entry *)pltsec->plt->sh_addr;
> +       struct plt_entry *plt = (struct plt_entry *)(sechdrs +
> pltsec->plt_shndx)->sh_addr;

What's wrong with

(struct plt_entry *)sechdrs[pltsec->plt_shndx].sh_addr

?


>         int i = pltsec->plt_num_entries;
>         u64 val = sym->st_value + rela->r_addend;
>
> @@ -43,11 +44,12 @@ u64 module_emit_plt_entry(struct module *mod, void *loc,
> const Elf64_Rela *rela,
> }
>
> #ifdef CONFIG_ARM64_ERRATUM_843419
> -u64 module_emit_veneer_for_adrp(struct module *mod, void *loc, u64 val)
> +u64 module_emit_veneer_for_adrp(struct module *mod, Elf64_Shdr *sechdrs,
> +                               void *loc, u64 val)
> {
>         struct mod_plt_sec *pltsec = !in_init(mod, loc) ? &mod->arch.core :
>                                                           &mod->arch.init;
> -       struct plt_entry *plt = (struct plt_entry *)pltsec->plt->sh_addr;
> +       struct plt_entry *plt = (struct plt_entry *)(sechdrs +
> pltsec->plt_shndx)->sh_addr;
>         int i = pltsec->plt_num_entries++;
>         u32 mov0, mov1, mov2, br;
>         int rd;
> @@ -202,7 +204,7 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr
> *sechdrs,
>         unsigned long core_plts = 0;
>         unsigned long init_plts = 0;
>         Elf64_Sym *syms = NULL;
> -       Elf_Shdr *tramp = NULL;
> +       Elf_Shdr *pltsec, *tramp = NULL;
>         int i;
>
>         /*
> @@ -211,9 +213,9 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr
> *sechdrs,
>          */
>         for (i = 0; i < ehdr->e_shnum; i++) {
>                 if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
> -                       mod->arch.core.plt = sechdrs + i;
> +                       mod->arch.core.plt_shndx = i;
>                 else if (!strcmp(secstrings + sechdrs[i].sh_name,
> ".init.plt"))
> -                       mod->arch.init.plt = sechdrs + i;
> +                       mod->arch.init.plt_shndx = i;
>                 else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
>                          !strcmp(secstrings + sechdrs[i].sh_name,
>                                  ".text.ftrace_trampoline"))
> @@ -222,7 +224,7 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr
> *sechdrs,
>                         syms = (Elf64_Sym *)sechdrs[i].sh_addr;
>         }
>
> -       if (!mod->arch.core.plt || !mod->arch.init.plt) {
> +       if (!mod->arch.core.plt_shndx || !mod->arch.init.plt_shndx) {
>                 pr_err("%s: module PLT section(s) missing\n", mod->name);
>                 return -ENOEXEC;
>         }
> @@ -254,17 +256,19 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr
> *sechdrs,
>                                                 sechdrs[i].sh_info, dstsec);
>         }
>
> -       mod->arch.core.plt->sh_type = SHT_NOBITS;
> -       mod->arch.core.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
> -       mod->arch.core.plt->sh_addralign = L1_CACHE_BYTES;
> -       mod->arch.core.plt->sh_size = (core_plts  + 1) * sizeof(struct
> plt_entry);
> +       pltsec = sechdrs + mod->arch.core.plt_shndx;
> +       pltsec->sh_type = SHT_NOBITS;
> +       pltsec->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
> +       pltsec->sh_addralign = L1_CACHE_BYTES;
> +       pltsec->sh_size = (core_plts  + 1) * sizeof(struct plt_entry);
>         mod->arch.core.plt_num_entries = 0;
>         mod->arch.core.plt_max_entries = core_plts;
>
> -       mod->arch.init.plt->sh_type = SHT_NOBITS;
> -       mod->arch.init.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
> -       mod->arch.init.plt->sh_addralign = L1_CACHE_BYTES;
> -       mod->arch.init.plt->sh_size = (init_plts + 1) * sizeof(struct
> plt_entry);
> +       pltsec = sechdrs + mod->arch.init.plt_shndx;
> +       pltsec->sh_type = SHT_NOBITS;
> +       pltsec->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
> +       pltsec->sh_addralign = L1_CACHE_BYTES;
> +       pltsec->sh_size = (init_plts + 1) * sizeof(struct plt_entry);
>         mod->arch.init.plt_num_entries = 0;
>         mod->arch.init.plt_max_entries = init_plts;
>
> diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
> index f0f27aeefb73..c2abe59c6f8f 100644
> --- a/arch/arm64/kernel/module.c
> +++ b/arch/arm64/kernel/module.c
> @@ -198,7 +198,8 @@ static int reloc_insn_imm(enum aarch64_reloc_op op,
> __le32 *place, u64 val,
>         return 0;
> }
>
> -static int reloc_insn_adrp(struct module *mod, __le32 *place, u64 val)
> +static int reloc_insn_adrp(struct module *mod, Elf64_Shdr *sechdrs,
> +                          __le32 *place, u64 val)
> {
>         u32 insn;
>
> @@ -215,7 +216,7 @@ static int reloc_insn_adrp(struct module *mod, __le32
> *place, u64 val)
>                 insn &= ~BIT(31);
>         } else {
>                 /* out of range for ADR -> emit a veneer */
> -               val = module_emit_veneer_for_adrp(mod, place, val & ~0xfff);
> +               val = module_emit_veneer_for_adrp(mod, sechdrs, place, val &
> ~0xfff);
>                 if (!val)
>                         return -ENOEXEC;
>                 insn = aarch64_insn_gen_branch_imm((u64)place, val,
> @@ -368,7 +369,7 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
>                 case R_AARCH64_ADR_PREL_PG_HI21_NC:
>                         overflow_check = false;
>                 case R_AARCH64_ADR_PREL_PG_HI21:
> -                       ovf = reloc_insn_adrp(me, loc, val);
> +                       ovf = reloc_insn_adrp(me, sechdrs, loc, val);
>                         if (ovf && ovf != -ERANGE)
>                                 return ovf;
>                         break;
> @@ -413,7 +414,7 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
>
>                         if (IS_ENABLED(CONFIG_ARM64_MODULE_PLTS) &&
>                             ovf == -ERANGE) {
> -                               val = module_emit_plt_entry(me, loc,
> &rel[i], sym);
> +                               val = module_emit_plt_entry(me, sechdrs,
> loc, &rel[i], sym);
>                                 if (!val)
>                                         return -ENOEXEC;
>                                 ovf = reloc_insn_imm(RELOC_OP_PREL, loc,
> val, 2,
> --
> 2.14.5
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH v2] arm64/module: use plt section indices for relocations
  2018-11-05 17:57   ` [PATCH] arm64/module: use plt section indices for relocations Jessica Yu
  2018-11-05 18:04     ` Ard Biesheuvel
@ 2018-11-05 18:53     ` Jessica Yu
  2018-11-05 18:56       ` Ard Biesheuvel
  2018-11-05 19:26       ` Will Deacon
  1 sibling, 2 replies; 48+ messages in thread
From: Jessica Yu @ 2018-11-05 18:53 UTC (permalink / raw)
  To: Torsten Duwe
  Cc: Will Deacon, Catalin Marinas, Julien Thierry, Steven Rostedt,
	Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel, Arnd Bergmann,
	AKASHI Takahiro, linux-arm-kernel, linux-kernel, live-patching

Instead of saving a pointer to the .plt and .init.plt sections to apply
plt-based relocations, save and use their section indices instead.

The mod->arch.{core,init}.plt pointers were problematic for livepatch
because they pointed within temporary section headers (provided by the
module loader via info->sechdrs) that would be freed after module load.
Since livepatch modules may need to apply relocations post-module-load
(for example, to patch a module that is loaded later), using section
indices to offset into the section headers (instead of accessing them
through a saved pointer) allows livepatch modules on arm64 to pass in
their own copy of the section headers to apply_relocate_add() to apply
delayed relocations.

Signed-off-by: Jessica Yu <jeyu@kernel.org>
---

v2: 

 - Do sechdrs[pltsec->plt_shndx].sh_addr instead of pointer math

Note: Addressed Will's comment about the pltsec -> plt_info rename and
removed that change to reduce unnecessary code churn. I didn't include the
Ack's for this reason so let me know if this version is OK as well.

 arch/arm64/include/asm/module.h |  8 +++++---
 arch/arm64/kernel/module-plts.c | 36 ++++++++++++++++++++----------------
 arch/arm64/kernel/module.c      |  9 +++++----
 3 files changed, 30 insertions(+), 23 deletions(-)

diff --git a/arch/arm64/include/asm/module.h b/arch/arm64/include/asm/module.h
index 97d0ef12e2ff..f81c1847312d 100644
--- a/arch/arm64/include/asm/module.h
+++ b/arch/arm64/include/asm/module.h
@@ -22,7 +22,7 @@
 
 #ifdef CONFIG_ARM64_MODULE_PLTS
 struct mod_plt_sec {
-	struct elf64_shdr	*plt;
+	int			plt_shndx;
 	int			plt_num_entries;
 	int			plt_max_entries;
 };
@@ -36,10 +36,12 @@ struct mod_arch_specific {
 };
 #endif
 
-u64 module_emit_plt_entry(struct module *mod, void *loc, const Elf64_Rela *rela,
+u64 module_emit_plt_entry(struct module *mod, Elf64_Shdr *sechdrs,
+			  void *loc, const Elf64_Rela *rela,
 			  Elf64_Sym *sym);
 
-u64 module_emit_veneer_for_adrp(struct module *mod, void *loc, u64 val);
+u64 module_emit_veneer_for_adrp(struct module *mod, Elf64_Shdr *sechdrs,
+				void *loc, u64 val);
 
 #ifdef CONFIG_RANDOMIZE_BASE
 extern u64 module_alloc_base;
diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c
index f0690c2ca3e0..a0efe30211d6 100644
--- a/arch/arm64/kernel/module-plts.c
+++ b/arch/arm64/kernel/module-plts.c
@@ -16,12 +16,13 @@ static bool in_init(const struct module *mod, void *loc)
 	return (u64)loc - (u64)mod->init_layout.base < mod->init_layout.size;
 }
 
-u64 module_emit_plt_entry(struct module *mod, void *loc, const Elf64_Rela *rela,
+u64 module_emit_plt_entry(struct module *mod, Elf64_Shdr *sechdrs,
+			  void *loc, const Elf64_Rela *rela,
 			  Elf64_Sym *sym)
 {
 	struct mod_plt_sec *pltsec = !in_init(mod, loc) ? &mod->arch.core :
 							  &mod->arch.init;
-	struct plt_entry *plt = (struct plt_entry *)pltsec->plt->sh_addr;
+	struct plt_entry *plt = (struct plt_entry *)sechdrs[pltsec->plt_shndx].sh_addr;
 	int i = pltsec->plt_num_entries;
 	u64 val = sym->st_value + rela->r_addend;
 
@@ -43,11 +44,12 @@ u64 module_emit_plt_entry(struct module *mod, void *loc, const Elf64_Rela *rela,
 }
 
 #ifdef CONFIG_ARM64_ERRATUM_843419
-u64 module_emit_veneer_for_adrp(struct module *mod, void *loc, u64 val)
+u64 module_emit_veneer_for_adrp(struct module *mod, Elf64_Shdr *sechdrs,
+				void *loc, u64 val)
 {
 	struct mod_plt_sec *pltsec = !in_init(mod, loc) ? &mod->arch.core :
 							  &mod->arch.init;
-	struct plt_entry *plt = (struct plt_entry *)pltsec->plt->sh_addr;
+	struct plt_entry *plt = (struct plt_entry *)sechdrs[pltsec->plt_shndx].sh_addr;
 	int i = pltsec->plt_num_entries++;
 	u32 mov0, mov1, mov2, br;
 	int rd;
@@ -202,7 +204,7 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 	unsigned long core_plts = 0;
 	unsigned long init_plts = 0;
 	Elf64_Sym *syms = NULL;
-	Elf_Shdr *tramp = NULL;
+	Elf_Shdr *pltsec, *tramp = NULL;
 	int i;
 
 	/*
@@ -211,9 +213,9 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 	 */
 	for (i = 0; i < ehdr->e_shnum; i++) {
 		if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
-			mod->arch.core.plt = sechdrs + i;
+			mod->arch.core.plt_shndx = i;
 		else if (!strcmp(secstrings + sechdrs[i].sh_name, ".init.plt"))
-			mod->arch.init.plt = sechdrs + i;
+			mod->arch.init.plt_shndx = i;
 		else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
 			 !strcmp(secstrings + sechdrs[i].sh_name,
 				 ".text.ftrace_trampoline"))
@@ -222,7 +224,7 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 			syms = (Elf64_Sym *)sechdrs[i].sh_addr;
 	}
 
-	if (!mod->arch.core.plt || !mod->arch.init.plt) {
+	if (!mod->arch.core.plt_shndx || !mod->arch.init.plt_shndx) {
 		pr_err("%s: module PLT section(s) missing\n", mod->name);
 		return -ENOEXEC;
 	}
@@ -254,17 +256,19 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 						sechdrs[i].sh_info, dstsec);
 	}
 
-	mod->arch.core.plt->sh_type = SHT_NOBITS;
-	mod->arch.core.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
-	mod->arch.core.plt->sh_addralign = L1_CACHE_BYTES;
-	mod->arch.core.plt->sh_size = (core_plts  + 1) * sizeof(struct plt_entry);
+	pltsec = sechdrs + mod->arch.core.plt_shndx;
+	pltsec->sh_type = SHT_NOBITS;
+	pltsec->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
+	pltsec->sh_addralign = L1_CACHE_BYTES;
+	pltsec->sh_size = (core_plts  + 1) * sizeof(struct plt_entry);
 	mod->arch.core.plt_num_entries = 0;
 	mod->arch.core.plt_max_entries = core_plts;
 
-	mod->arch.init.plt->sh_type = SHT_NOBITS;
-	mod->arch.init.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
-	mod->arch.init.plt->sh_addralign = L1_CACHE_BYTES;
-	mod->arch.init.plt->sh_size = (init_plts + 1) * sizeof(struct plt_entry);
+	pltsec = sechdrs + mod->arch.init.plt_shndx;
+	pltsec->sh_type = SHT_NOBITS;
+	pltsec->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
+	pltsec->sh_addralign = L1_CACHE_BYTES;
+	pltsec->sh_size = (init_plts + 1) * sizeof(struct plt_entry);
 	mod->arch.init.plt_num_entries = 0;
 	mod->arch.init.plt_max_entries = init_plts;
 
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index f0f27aeefb73..c2abe59c6f8f 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -198,7 +198,8 @@ static int reloc_insn_imm(enum aarch64_reloc_op op, __le32 *place, u64 val,
 	return 0;
 }
 
-static int reloc_insn_adrp(struct module *mod, __le32 *place, u64 val)
+static int reloc_insn_adrp(struct module *mod, Elf64_Shdr *sechdrs,
+			   __le32 *place, u64 val)
 {
 	u32 insn;
 
@@ -215,7 +216,7 @@ static int reloc_insn_adrp(struct module *mod, __le32 *place, u64 val)
 		insn &= ~BIT(31);
 	} else {
 		/* out of range for ADR -> emit a veneer */
-		val = module_emit_veneer_for_adrp(mod, place, val & ~0xfff);
+		val = module_emit_veneer_for_adrp(mod, sechdrs, place, val & ~0xfff);
 		if (!val)
 			return -ENOEXEC;
 		insn = aarch64_insn_gen_branch_imm((u64)place, val,
@@ -368,7 +369,7 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
 		case R_AARCH64_ADR_PREL_PG_HI21_NC:
 			overflow_check = false;
 		case R_AARCH64_ADR_PREL_PG_HI21:
-			ovf = reloc_insn_adrp(me, loc, val);
+			ovf = reloc_insn_adrp(me, sechdrs, loc, val);
 			if (ovf && ovf != -ERANGE)
 				return ovf;
 			break;
@@ -413,7 +414,7 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
 
 			if (IS_ENABLED(CONFIG_ARM64_MODULE_PLTS) &&
 			    ovf == -ERANGE) {
-				val = module_emit_plt_entry(me, loc, &rel[i], sym);
+				val = module_emit_plt_entry(me, sechdrs, loc, &rel[i], sym);
 				if (!val)
 					return -ENOEXEC;
 				ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2,
-- 
2.14.5


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2] arm64/module: use plt section indices for relocations
  2018-11-05 18:53     ` [PATCH v2] " Jessica Yu
@ 2018-11-05 18:56       ` Ard Biesheuvel
  2018-11-05 19:26       ` Will Deacon
  1 sibling, 0 replies; 48+ messages in thread
From: Ard Biesheuvel @ 2018-11-05 18:56 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Torsten Duwe, Will Deacon, Catalin Marinas, Julien Thierry,
	Steven Rostedt, Josh Poimboeuf, Ingo Molnar, Arnd Bergmann,
	AKASHI Takahiro, linux-arm-kernel, Linux Kernel Mailing List,
	live-patching

On 5 November 2018 at 19:53, Jessica Yu <jeyu@kernel.org> wrote:
> Instead of saving a pointer to the .plt and .init.plt sections to apply
> plt-based relocations, save and use their section indices instead.
>
> The mod->arch.{core,init}.plt pointers were problematic for livepatch
> because they pointed within temporary section headers (provided by the
> module loader via info->sechdrs) that would be freed after module load.
> Since livepatch modules may need to apply relocations post-module-load
> (for example, to patch a module that is loaded later), using section
> indices to offset into the section headers (instead of accessing them
> through a saved pointer) allows livepatch modules on arm64 to pass in
> their own copy of the section headers to apply_relocate_add() to apply
> delayed relocations.
>
> Signed-off-by: Jessica Yu <jeyu@kernel.org>

Thanks Jessica

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

> ---
>
> v2:
> - Do sechdrs[pltsec->plt_shndx].sh_addr instead of pointer math
>
> Note: Addressed Will's comment about the pltsec -> plt_info rename and
> removed that change to reduce unnecessary code churn. I didn't include the
>
> Ack's for this reason so let me know if this version is OK as well.
>
> arch/arm64/include/asm/module.h |  8 +++++---
> arch/arm64/kernel/module-plts.c | 36 ++++++++++++++++++++----------------
> arch/arm64/kernel/module.c      |  9 +++++----
> 3 files changed, 30 insertions(+), 23 deletions(-)
>
> diff --git a/arch/arm64/include/asm/module.h
> b/arch/arm64/include/asm/module.h
> index 97d0ef12e2ff..f81c1847312d 100644
> --- a/arch/arm64/include/asm/module.h
> +++ b/arch/arm64/include/asm/module.h
> @@ -22,7 +22,7 @@
>
> #ifdef CONFIG_ARM64_MODULE_PLTS
> struct mod_plt_sec {
> -       struct elf64_shdr       *plt;
> +       int                     plt_shndx;
>         int                     plt_num_entries;
>         int                     plt_max_entries;
> };
> @@ -36,10 +36,12 @@ struct mod_arch_specific {
> };
> #endif
>
> -u64 module_emit_plt_entry(struct module *mod, void *loc, const Elf64_Rela
> *rela,
> +u64 module_emit_plt_entry(struct module *mod, Elf64_Shdr *sechdrs,
> +                         void *loc, const Elf64_Rela *rela,
>                           Elf64_Sym *sym);
>
> -u64 module_emit_veneer_for_adrp(struct module *mod, void *loc, u64 val);
> +u64 module_emit_veneer_for_adrp(struct module *mod, Elf64_Shdr *sechdrs,
> +                               void *loc, u64 val);
>
> #ifdef CONFIG_RANDOMIZE_BASE
> extern u64 module_alloc_base;
> diff --git a/arch/arm64/kernel/module-plts.c
> b/arch/arm64/kernel/module-plts.c
> index f0690c2ca3e0..a0efe30211d6 100644
> --- a/arch/arm64/kernel/module-plts.c
> +++ b/arch/arm64/kernel/module-plts.c
> @@ -16,12 +16,13 @@ static bool in_init(const struct module *mod, void *loc)
>         return (u64)loc - (u64)mod->init_layout.base <
> mod->init_layout.size;
> }
>
> -u64 module_emit_plt_entry(struct module *mod, void *loc, const Elf64_Rela
> *rela,
> +u64 module_emit_plt_entry(struct module *mod, Elf64_Shdr *sechdrs,
> +                         void *loc, const Elf64_Rela *rela,
>                           Elf64_Sym *sym)
> {
>         struct mod_plt_sec *pltsec = !in_init(mod, loc) ? &mod->arch.core :
>                                                           &mod->arch.init;
> -       struct plt_entry *plt = (struct plt_entry *)pltsec->plt->sh_addr;
> +       struct plt_entry *plt = (struct plt_entry
> *)sechdrs[pltsec->plt_shndx].sh_addr;
>         int i = pltsec->plt_num_entries;
>         u64 val = sym->st_value + rela->r_addend;
>
> @@ -43,11 +44,12 @@ u64 module_emit_plt_entry(struct module *mod, void *loc,
> const Elf64_Rela *rela,
> }
>
> #ifdef CONFIG_ARM64_ERRATUM_843419
> -u64 module_emit_veneer_for_adrp(struct module *mod, void *loc, u64 val)
> +u64 module_emit_veneer_for_adrp(struct module *mod, Elf64_Shdr *sechdrs,
> +                               void *loc, u64 val)
> {
>         struct mod_plt_sec *pltsec = !in_init(mod, loc) ? &mod->arch.core :
>                                                           &mod->arch.init;
> -       struct plt_entry *plt = (struct plt_entry *)pltsec->plt->sh_addr;
> +       struct plt_entry *plt = (struct plt_entry
> *)sechdrs[pltsec->plt_shndx].sh_addr;
>
>         int i = pltsec->plt_num_entries++;
>         u32 mov0, mov1, mov2, br;
>         int rd;
> @@ -202,7 +204,7 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr
> *sechdrs,
>         unsigned long core_plts = 0;
>         unsigned long init_plts = 0;
>         Elf64_Sym *syms = NULL;
> -       Elf_Shdr *tramp = NULL;
> +       Elf_Shdr *pltsec, *tramp = NULL;
>         int i;
>
>         /*
> @@ -211,9 +213,9 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr
> *sechdrs,
>          */
>         for (i = 0; i < ehdr->e_shnum; i++) {
>                 if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
> -                       mod->arch.core.plt = sechdrs + i;
> +                       mod->arch.core.plt_shndx = i;
>                 else if (!strcmp(secstrings + sechdrs[i].sh_name,
> ".init.plt"))
> -                       mod->arch.init.plt = sechdrs + i;
> +                       mod->arch.init.plt_shndx = i;
>                 else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
>                          !strcmp(secstrings + sechdrs[i].sh_name,
>                                  ".text.ftrace_trampoline"))
> @@ -222,7 +224,7 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr
> *sechdrs,
>                         syms = (Elf64_Sym *)sechdrs[i].sh_addr;
>         }
>
> -       if (!mod->arch.core.plt || !mod->arch.init.plt) {
> +       if (!mod->arch.core.plt_shndx || !mod->arch.init.plt_shndx) {
>                 pr_err("%s: module PLT section(s) missing\n", mod->name);
>                 return -ENOEXEC;
>         }
> @@ -254,17 +256,19 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr
> *sechdrs,
>                                                 sechdrs[i].sh_info, dstsec);
>         }
>
> -       mod->arch.core.plt->sh_type = SHT_NOBITS;
> -       mod->arch.core.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
> -       mod->arch.core.plt->sh_addralign = L1_CACHE_BYTES;
> -       mod->arch.core.plt->sh_size = (core_plts  + 1) * sizeof(struct
> plt_entry);
> +       pltsec = sechdrs + mod->arch.core.plt_shndx;
> +       pltsec->sh_type = SHT_NOBITS;
> +       pltsec->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
> +       pltsec->sh_addralign = L1_CACHE_BYTES;
> +       pltsec->sh_size = (core_plts  + 1) * sizeof(struct plt_entry);
>         mod->arch.core.plt_num_entries = 0;
>         mod->arch.core.plt_max_entries = core_plts;
>
> -       mod->arch.init.plt->sh_type = SHT_NOBITS;
> -       mod->arch.init.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
> -       mod->arch.init.plt->sh_addralign = L1_CACHE_BYTES;
> -       mod->arch.init.plt->sh_size = (init_plts + 1) * sizeof(struct
> plt_entry);
> +       pltsec = sechdrs + mod->arch.init.plt_shndx;
> +       pltsec->sh_type = SHT_NOBITS;
> +       pltsec->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
> +       pltsec->sh_addralign = L1_CACHE_BYTES;
> +       pltsec->sh_size = (init_plts + 1) * sizeof(struct plt_entry);
>         mod->arch.init.plt_num_entries = 0;
>         mod->arch.init.plt_max_entries = init_plts;
>
> diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
> index f0f27aeefb73..c2abe59c6f8f 100644
> --- a/arch/arm64/kernel/module.c
> +++ b/arch/arm64/kernel/module.c
> @@ -198,7 +198,8 @@ static int reloc_insn_imm(enum aarch64_reloc_op op,
> __le32 *place, u64 val,
>         return 0;
> }
>
> -static int reloc_insn_adrp(struct module *mod, __le32 *place, u64 val)
> +static int reloc_insn_adrp(struct module *mod, Elf64_Shdr *sechdrs,
> +                          __le32 *place, u64 val)
> {
>         u32 insn;
>
> @@ -215,7 +216,7 @@ static int reloc_insn_adrp(struct module *mod, __le32
> *place, u64 val)
>                 insn &= ~BIT(31);
>         } else {
>                 /* out of range for ADR -> emit a veneer */
> -               val = module_emit_veneer_for_adrp(mod, place, val & ~0xfff);
> +               val = module_emit_veneer_for_adrp(mod, sechdrs, place, val &
> ~0xfff);
>                 if (!val)
>                         return -ENOEXEC;
>                 insn = aarch64_insn_gen_branch_imm((u64)place, val,
> @@ -368,7 +369,7 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
>                 case R_AARCH64_ADR_PREL_PG_HI21_NC:
>                         overflow_check = false;
>                 case R_AARCH64_ADR_PREL_PG_HI21:
> -                       ovf = reloc_insn_adrp(me, loc, val);
> +                       ovf = reloc_insn_adrp(me, sechdrs, loc, val);
>                         if (ovf && ovf != -ERANGE)
>                                 return ovf;
>                         break;
> @@ -413,7 +414,7 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
>
>                         if (IS_ENABLED(CONFIG_ARM64_MODULE_PLTS) &&
>                             ovf == -ERANGE) {
> -                               val = module_emit_plt_entry(me, loc,
> &rel[i], sym);
> +                               val = module_emit_plt_entry(me, sechdrs,
> loc, &rel[i], sym);
>                                 if (!val)
>                                         return -ENOEXEC;
>                                 ovf = reloc_insn_imm(RELOC_OP_PREL, loc,
> val, 2,
> --
> 2.14.5
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2] arm64/module: use plt section indices for relocations
  2018-11-05 18:53     ` [PATCH v2] " Jessica Yu
  2018-11-05 18:56       ` Ard Biesheuvel
@ 2018-11-05 19:26       ` Will Deacon
  2018-11-05 19:49         ` Jessica Yu
  2018-11-06  9:44         ` Miroslav Benes
  1 sibling, 2 replies; 48+ messages in thread
From: Will Deacon @ 2018-11-05 19:26 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Torsten Duwe, Catalin Marinas, Julien Thierry, Steven Rostedt,
	Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel, Arnd Bergmann,
	AKASHI Takahiro, linux-arm-kernel, linux-kernel, live-patching

On Mon, Nov 05, 2018 at 07:53:23PM +0100, Jessica Yu wrote:
> Instead of saving a pointer to the .plt and .init.plt sections to apply
> plt-based relocations, save and use their section indices instead.
> 
> The mod->arch.{core,init}.plt pointers were problematic for livepatch
> because they pointed within temporary section headers (provided by the
> module loader via info->sechdrs) that would be freed after module load.
> Since livepatch modules may need to apply relocations post-module-load
> (for example, to patch a module that is loaded later), using section
> indices to offset into the section headers (instead of accessing them
> through a saved pointer) allows livepatch modules on arm64 to pass in
> their own copy of the section headers to apply_relocate_add() to apply
> delayed relocations.
> 
> Signed-off-by: Jessica Yu <jeyu@kernel.org>
> ---
> 
> v2:
> 
> - Do sechdrs[pltsec->plt_shndx].sh_addr instead of pointer math
> 
> Note: Addressed Will's comment about the pltsec -> plt_info rename and
> removed that change to reduce unnecessary code churn. I didn't include the
> Ack's for this reason so let me know if this version is OK as well.

Thanks, Jessica!

Acked-by: Will Deacon <will.deacon@arm.com>

> arch/arm64/include/asm/module.h |  8 +++++---
> arch/arm64/kernel/module-plts.c | 36 ++++++++++++++++++++----------------
> arch/arm64/kernel/module.c      |  9 +++++----
> 3 files changed, 30 insertions(+), 23 deletions(-)

Actually, I guess I should just queue this for 4.21 given that it's
completely self-contained.

Will

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2] arm64/module: use plt section indices for relocations
  2018-11-05 19:26       ` Will Deacon
@ 2018-11-05 19:49         ` Jessica Yu
  2018-11-06  9:44         ` Miroslav Benes
  1 sibling, 0 replies; 48+ messages in thread
From: Jessica Yu @ 2018-11-05 19:49 UTC (permalink / raw)
  To: Will Deacon
  Cc: Torsten Duwe, Catalin Marinas, Julien Thierry, Steven Rostedt,
	Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel, Arnd Bergmann,
	AKASHI Takahiro, linux-arm-kernel, linux-kernel, live-patching

+++ Will Deacon [05/11/18 19:26 +0000]:
>On Mon, Nov 05, 2018 at 07:53:23PM +0100, Jessica Yu wrote:
>> Instead of saving a pointer to the .plt and .init.plt sections to apply
>> plt-based relocations, save and use their section indices instead.
>>
>> The mod->arch.{core,init}.plt pointers were problematic for livepatch
>> because they pointed within temporary section headers (provided by the
>> module loader via info->sechdrs) that would be freed after module load.
>> Since livepatch modules may need to apply relocations post-module-load
>> (for example, to patch a module that is loaded later), using section
>> indices to offset into the section headers (instead of accessing them
>> through a saved pointer) allows livepatch modules on arm64 to pass in
>> their own copy of the section headers to apply_relocate_add() to apply
>> delayed relocations.
>>
>> Signed-off-by: Jessica Yu <jeyu@kernel.org>
>> ---
>>
>> v2:
>>
>> - Do sechdrs[pltsec->plt_shndx].sh_addr instead of pointer math
>>
>> Note: Addressed Will's comment about the pltsec -> plt_info rename and
>> removed that change to reduce unnecessary code churn. I didn't include the
>> Ack's for this reason so let me know if this version is OK as well.
>
>Thanks, Jessica!
>
>Acked-by: Will Deacon <will.deacon@arm.com>
>
>> arch/arm64/include/asm/module.h |  8 +++++---
>> arch/arm64/kernel/module-plts.c | 36 ++++++++++++++++++++----------------
>> arch/arm64/kernel/module.c      |  9 +++++----
>> 3 files changed, 30 insertions(+), 23 deletions(-)
>
>Actually, I guess I should just queue this for 4.21 given that it's
>completely self-contained.

Yes, that's fine :-) Then Torsten won't have to include this patch to his
patchset.

Thank you!

Jessica

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2] arm64/module: use plt section indices for relocations
  2018-11-05 19:26       ` Will Deacon
  2018-11-05 19:49         ` Jessica Yu
@ 2018-11-06  9:44         ` Miroslav Benes
  1 sibling, 0 replies; 48+ messages in thread
From: Miroslav Benes @ 2018-11-06  9:44 UTC (permalink / raw)
  To: Will Deacon
  Cc: Jessica Yu, Torsten Duwe, Catalin Marinas, Julien Thierry,
	Steven Rostedt, Josh Poimboeuf, Ingo Molnar, Ard Biesheuvel,
	Arnd Bergmann, AKASHI Takahiro, linux-arm-kernel, linux-kernel,
	live-patching

On Mon, 5 Nov 2018, Will Deacon wrote:

> On Mon, Nov 05, 2018 at 07:53:23PM +0100, Jessica Yu wrote:
> > Instead of saving a pointer to the .plt and .init.plt sections to apply
> > plt-based relocations, save and use their section indices instead.
> > 
> > The mod->arch.{core,init}.plt pointers were problematic for livepatch
> > because they pointed within temporary section headers (provided by the
> > module loader via info->sechdrs) that would be freed after module load.
> > Since livepatch modules may need to apply relocations post-module-load
> > (for example, to patch a module that is loaded later), using section
> > indices to offset into the section headers (instead of accessing them
> > through a saved pointer) allows livepatch modules on arm64 to pass in
> > their own copy of the section headers to apply_relocate_add() to apply
> > delayed relocations.
> > 
> > Signed-off-by: Jessica Yu <jeyu@kernel.org>
> > ---
> > 
> > v2:
> > 
> > - Do sechdrs[pltsec->plt_shndx].sh_addr instead of pointer math
> > 
> > Note: Addressed Will's comment about the pltsec -> plt_info rename and
> > removed that change to reduce unnecessary code churn. I didn't include the
> > Ack's for this reason so let me know if this version is OK as well.
> 
> Thanks, Jessica!
> 
> Acked-by: Will Deacon <will.deacon@arm.com>
> 
> > arch/arm64/include/asm/module.h |  8 +++++---
> > arch/arm64/kernel/module-plts.c | 36 ++++++++++++++++++++----------------
> > arch/arm64/kernel/module.c      |  9 +++++----
> > 3 files changed, 30 insertions(+), 23 deletions(-)
> 
> Actually, I guess I should just queue this for 4.21 given that it's
> completely self-contained.

FWIW you can also add my

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

Thanks, Jessica.

Miroslav

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2018-11-06  9:44 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-01 14:09 [PATCH v3 0/4] arm64 live patching Torsten Duwe
2018-10-01 14:16 ` [PATCH v3 1/4] DYNAMIC_FTRACE configurable with and without REGS Torsten Duwe
2018-10-01 14:52   ` Ard Biesheuvel
2018-10-01 15:03     ` Torsten Duwe
2018-10-01 15:06       ` Ard Biesheuvel
2018-10-01 15:10         ` Torsten Duwe
2018-10-01 15:14           ` Steven Rostedt
2018-10-01 14:16 ` [PATCH v3 2/4] arm64: implement ftrace with regs Torsten Duwe
2018-10-01 15:57   ` Ard Biesheuvel
2018-10-02 10:02     ` Torsten Duwe
2018-10-02 10:39       ` Ard Biesheuvel
2018-10-02 11:27   ` Mark Rutland
2018-10-02 12:18     ` Torsten Duwe
2018-10-02 12:57       ` Mark Rutland
2018-10-01 14:16 ` [PATCH v3 3/4] arm64: implement live patching Torsten Duwe
2018-10-17 13:39   ` Miroslav Benes
2018-10-18 12:58     ` Jessica Yu
2018-10-19 11:59       ` Miroslav Benes
2018-10-19 12:18         ` Jessica Yu
2018-10-19 15:14           ` Miroslav Benes
2018-10-19 13:46         ` Torsten Duwe
2018-10-19 13:52       ` Ard Biesheuvel
2018-10-19 15:21         ` Miroslav Benes
2018-10-20 14:10           ` Ard Biesheuvel
2018-10-22 12:53             ` Miroslav Benes
2018-10-22 14:54               ` Torsten Duwe
2018-10-23 17:55   ` [PATCH] arm64/module: use mod->klp_info section header information Jessica Yu
2018-10-23 19:32     ` kbuild test robot
2018-10-24 11:57     ` Miroslav Benes
2018-10-25  8:08     ` Petr Mladek
2018-10-25  9:00       ` Miroslav Benes
2018-10-25 11:42         ` Jessica Yu
2018-10-26 17:25     ` [PATCH v2] arm64/module: use mod->klp_info section header information for livepatch modules Jessica Yu
2018-10-29 13:24       ` Miroslav Benes
2018-10-29 13:32         ` Jessica Yu
2018-10-29 15:28       ` Will Deacon
2018-10-30 13:19         ` Jessica Yu
2018-11-01 15:18           ` Miroslav Benes
2018-11-01 16:07           ` Will Deacon
2018-11-05 12:30             ` Ard Biesheuvel
2018-11-05 17:57   ` [PATCH] arm64/module: use plt section indices for relocations Jessica Yu
2018-11-05 18:04     ` Ard Biesheuvel
2018-11-05 18:53     ` [PATCH v2] " Jessica Yu
2018-11-05 18:56       ` Ard Biesheuvel
2018-11-05 19:26       ` Will Deacon
2018-11-05 19:49         ` Jessica Yu
2018-11-06  9:44         ` Miroslav Benes
2018-10-01 14:16 ` [PATCH v3 4/4] arm64: reliable stacktraces Torsten Duwe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).