* [PATCH v2 0/2] arm64: ftrace: fix interop issues with module PLTs
From: Ard Biesheuvel @ 2017-05-30 13:52 UTC
  To: linux-arm-kernel

The arm64 ftrace code assumes that all core kernel symbols are within the
range of an ordinary branch instruction from anywhere in the kernel, even
in modules. While this is usually the case, it may not be true when running
with KASLR enabled.

This series fixes two distinct but related issues involving ftrace under
KASLR:

- the initial sweep over all _mcount() calls in newly loaded modules
  validates those calls by comparing each against a 'bl _mcount' opcode.
  This comparison fails when _mcount() [which lives in the core kernel]
  is out of range, producing false negatives: those call sites do contain
  a correct branch to _mcount(), only routed via a PLT entry (patch #1)

- patching those NOPs into calls to ftrace_caller() [which is the only
  ftrace branch target that we ever use on arm64, IIUC] fails in the
  same way, and needs to be redirected via a trampoline as well
  (patch #2; the range constraint involved is sketched below)

v2:
- no changes to #1
- don't hardcode references to ftrace_caller() (#2)
- avoid a stack frame for the trampoline, given that it would confuse the
  ftrace code (#2)

Ard Biesheuvel (2):
  arm64: ftrace: don't validate branch via PLT in ftrace_make_nop()
  arm64: ftrace: add support for far branches to dynamic ftrace

 arch/arm64/Kconfig              |  2 +-
 arch/arm64/Makefile             |  3 ++
 arch/arm64/include/asm/module.h |  3 ++
 arch/arm64/kernel/Makefile      |  3 ++
 arch/arm64/kernel/ftrace-mod.S  | 18 +++++++
 arch/arm64/kernel/ftrace.c      | 57 ++++++++++++++++++--
 arch/arm64/kernel/module-plts.c | 10 ++++
 7 files changed, 91 insertions(+), 5 deletions(-)
 create mode 100644 arch/arm64/kernel/ftrace-mod.S

-- 
2.9.3


* [PATCH v2 1/2] arm64: ftrace: don't validate branch via PLT in ftrace_make_nop()
From: Ard Biesheuvel @ 2017-05-30 13:52 UTC
  To: linux-arm-kernel

When turning branch instructions into NOPs, we attempt to validate the
action by comparing the old value at the call site with the opcode of
a direct relative branch instruction pointing at the old target.

However, these call sites are statically initialized to call _mcount(),
and may be redirected via a PLT entry if the module is loaded far away
from the kernel text, leading to false negatives and spurious errors.

So skip the validation if CONFIG_ARM64_MODULE_PLTS is configured.
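
To illustrate the false negative (a sketch that is not part of the patch,
with hypothetical addresses and a hypothetical helper): the call site really
contains a bl to the module's PLT entry, while the validation compares
against a direct bl to _mcount(), which is not even encodable at that
distance, so the two 32-bit values can never match.

    #include <stdint.h>
    #include <stdio.h>

    /* Encode "bl target" for a call site at pc: opcode 0x94000000 plus
     * a signed 26-bit word offset. Assumes the target is in range, as
     * aarch64_insn_gen_branch_imm() would verify for real. */
    static uint32_t gen_bl(uint64_t pc, uint64_t target)
    {
            int64_t offset = (int64_t)(target - pc) >> 2;

            return 0x94000000u | ((uint32_t)offset & 0x03ffffffu);
    }

    int main(void)
    {
            uint64_t pc  = 0xffff7dffbff00000ULL; /* hypothetical module call site */
            uint64_t plt = 0xffff7dffbff10000ULL; /* hypothetical PLT entry        */

            /* What the call site actually holds: a correct branch to
             * _mcount(), just routed via the PLT. */
            printf("actual call site insn: 0x%08x\n", gen_bl(pc, plt));
            return 0;
    }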

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/kernel/ftrace.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
index 40ad08ac569a..a8db6857cad6 100644
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -84,12 +84,15 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
 		    unsigned long addr)
 {
 	unsigned long pc = rec->ip;
-	u32 old, new;
+	u32 old = 0, new;
 
-	old = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
+	if (!IS_ENABLED(CONFIG_ARM64_MODULE_PLTS))
+		old = aarch64_insn_gen_branch_imm(pc, addr,
+						  AARCH64_INSN_BRANCH_LINK);
 	new = aarch64_insn_gen_nop();
 
-	return ftrace_modify_code(pc, old, new, true);
+	return ftrace_modify_code(pc, old, new,
+				  !IS_ENABLED(CONFIG_ARM64_MODULE_PLTS));
 }
 
 void arch_ftrace_update_code(int command)
-- 
2.9.3


* [PATCH v2 2/2] arm64: ftrace: add support for far branches to dynamic ftrace
From: Ard Biesheuvel @ 2017-05-30 13:52 UTC
  To: linux-arm-kernel

Currently, dynamic ftrace support in the arm64 kernel assumes that all
core kernel code is within range of the ordinary branch instructions
that occur in module code. This is usually the case, but it is no longer
guaranteed now that we have support for module PLTs and address space
randomization.

Since on arm64, all patching of branch instructions involves function calls
to the same entry point [ftrace_caller()], we can emit the modules with a
trampoline that has unlimited range, and patch both the trampoline itself
and the branch instruction to redirect the call via the trampoline.
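
As a rough C model of the mechanism (hypothetical types and helper names,
not the code in the patch below): the trampoline is an 8-byte literal slot
followed by a fixed "ldr x16, <slot>; br x16" pair, and patching picks
between a direct branch and the trampoline.

    #include <stdint.h>

    struct ftrace_tramp {
            uint64_t target;   /* literal slot, patched to hold the entry point */
            uint32_t insn[2];  /* fixed code: ldr x16, <slot>; br x16           */
    };

    static int in_bl_range(uint64_t pc, uint64_t target)
    {
            int64_t offset = (int64_t)(target - pc);
            int64_t limit = 128LL << 20;    /* SZ_128M */

            return offset >= -limit && offset < limit;
    }

    /* Pick the bl target for a call site at pc: ftrace_caller directly
     * when in range, otherwise the trampoline, after publishing
     * ftrace_caller's address in the literal slot. */
    static uint64_t pick_branch_target(struct ftrace_tramp *tramp,
                                       uint64_t pc, uint64_t ftrace_caller)
    {
            if (in_bl_range(pc, ftrace_caller))
                    return ftrace_caller;

            tramp->target = ftrace_caller;
            return (uint64_t)(uintptr_t)&tramp->insn[0];
    }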

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/Kconfig              |  2 +-
 arch/arm64/Makefile             |  3 ++
 arch/arm64/include/asm/module.h |  3 ++
 arch/arm64/kernel/Makefile      |  3 ++
 arch/arm64/kernel/ftrace-mod.S  | 18 +++++++
 arch/arm64/kernel/ftrace.c      | 52 ++++++++++++++++++--
 arch/arm64/kernel/module-plts.c | 10 ++++
 7 files changed, 87 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 3dcd7ec69bca..22f769b254b4 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -982,7 +982,7 @@ config RANDOMIZE_BASE
 
 config RANDOMIZE_MODULE_REGION_FULL
 	bool "Randomize the module region independently from the core kernel"
-	depends on RANDOMIZE_BASE && !DYNAMIC_FTRACE
+	depends on RANDOMIZE_BASE
 	default y
 	help
 	  Randomizes the location of the module region without considering the
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index f839ecd919f9..1ce57b42f390 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -70,6 +70,9 @@ endif
 
 ifeq ($(CONFIG_ARM64_MODULE_PLTS),y)
 KBUILD_LDFLAGS_MODULE	+= -T $(srctree)/arch/arm64/kernel/module.lds
+ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
+KBUILD_LDFLAGS_MODULE	+= $(objtree)/arch/arm64/kernel/ftrace-mod.o
+endif
 endif
 
 # Default value
diff --git a/arch/arm64/include/asm/module.h b/arch/arm64/include/asm/module.h
index d57693f5d4ec..e931142c0e2a 100644
--- a/arch/arm64/include/asm/module.h
+++ b/arch/arm64/include/asm/module.h
@@ -30,6 +30,9 @@ struct mod_plt_sec {
 struct mod_arch_specific {
 	struct mod_plt_sec	core;
 	struct mod_plt_sec	init;
+
+	/* for CONFIG_DYNAMIC_FTRACE */
+	struct elf64_shdr	*ftrace_trampoline;
 };
 #endif
 
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 1dcb69d3d0e5..f2b4e816b6de 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -62,3 +62,6 @@ extra-y					+= $(head-y) vmlinux.lds
 ifeq ($(CONFIG_DEBUG_EFI),y)
 AFLAGS_head.o += -DVMLINUX_PATH="\"$(realpath $(objtree)/vmlinux)\""
 endif
+
+# will be included by each individual module but not by the core kernel itself
+extra-$(CONFIG_DYNAMIC_FTRACE) += ftrace-mod.o
diff --git a/arch/arm64/kernel/ftrace-mod.S b/arch/arm64/kernel/ftrace-mod.S
new file mode 100644
index 000000000000..00c4025be4ff
--- /dev/null
+++ b/arch/arm64/kernel/ftrace-mod.S
@@ -0,0 +1,18 @@
+/*
+ * Copyright (C) 2017 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+
+	.section	".text.ftrace_trampoline", "ax"
+	.align		3
+0:	.quad		0
+__ftrace_trampoline:
+	ldr		x16, 0b
+	br		x16
+ENDPROC(__ftrace_trampoline)
diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
index a8db6857cad6..d55151f0dc59 100644
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -9,11 +9,14 @@
  * published by the Free Software Foundation.
  */
 
+#include <linux/elf.h>
 #include <linux/ftrace.h>
+#include <linux/module.h>
 #include <linux/swab.h>
 #include <linux/uaccess.h>
 
 #include <asm/cacheflush.h>
+#include <asm/debug-monitors.h>
 #include <asm/ftrace.h>
 #include <asm/insn.h>
 
@@ -63,6 +66,50 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
 	return ftrace_modify_code(pc, 0, new, false);
 }
 
+static u32 __ftrace_gen_branch(unsigned long pc, unsigned long addr)
+{
+	long offset = (long)pc - (long)addr;
+	unsigned long *tramp;
+	struct module *mod;
+
+	if (IS_ENABLED(CONFIG_ARM64_MODULE_PLTS) &&
+	    (offset < -SZ_128M || offset >= SZ_128M)) {
+
+		/*
+		 * On kernels that support module PLTs, the offset between the
+		 * call and its target may legally exceed the range of an
+		 * ordinary branch instruction. In this case, we need to branch
+		 * via a trampoline in the module.
+		 */
+		mod = __module_address(pc);
+		if (WARN_ON(!mod))
+			return AARCH64_BREAK_FAULT;
+
+		/*
+		 * There is only one ftrace trampoline per module. For now,
+		 * this is not a problem since on arm64, all dynamic ftrace
+		 * invocations are routed via ftrace_caller(). This will need
+		 * to be revisited if support for multiple ftrace entry points
+		 * is added in the future, but for now, the pr_err() below
+		 * deals with a theoretical issue only.
+		 */
+		tramp = (unsigned long *)mod->arch.ftrace_trampoline->sh_addr;
+		if (tramp[0] != addr) {
+			if (tramp[0] != 0) {
+				pr_err("ftrace: far branches to multiple entry points unsupported inside a single module\n");
+				return AARCH64_BREAK_FAULT;
+			}
+
+			/* point the trampoline to our ftrace entry point */
+			module_disable_ro(mod);
+			tramp[0] = addr;
+			module_enable_ro(mod, true);
+		}
+		addr = (unsigned long)&tramp[1];
+	}
+	return aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
+}
+
 /*
  * Turn on the call to ftrace_caller() in instrumented function
  */
@@ -72,7 +119,7 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
 	u32 old, new;
 
 	old = aarch64_insn_gen_nop();
-	new = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
+	new = __ftrace_gen_branch(pc, addr);
 
 	return ftrace_modify_code(pc, old, new, true);
 }
@@ -87,8 +134,7 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
 	u32 old = 0, new;
 
 	if (!IS_ENABLED(CONFIG_ARM64_MODULE_PLTS))
-		old = aarch64_insn_gen_branch_imm(pc, addr,
-						  AARCH64_INSN_BRANCH_LINK);
+		old = __ftrace_gen_branch(pc, addr);
 	new = aarch64_insn_gen_nop();
 
 	return ftrace_modify_code(pc, old, new,
diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c
index d05dbe658409..b469c504d688 100644
--- a/arch/arm64/kernel/module-plts.c
+++ b/arch/arm64/kernel/module-plts.c
@@ -167,6 +167,10 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 			mod->arch.init.plt = sechdrs + i;
 		else if (sechdrs[i].sh_type == SHT_SYMTAB)
 			syms = (Elf64_Sym *)sechdrs[i].sh_addr;
+		else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
+			 strcmp(".text.ftrace_trampoline",
+				secstrings + sechdrs[i].sh_name) == 0)
+			mod->arch.ftrace_trampoline = sechdrs + i;
 	}
 
 	if (!mod->arch.core.plt || !mod->arch.init.plt) {
@@ -178,6 +182,12 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 		return -ENOEXEC;
 	}
 
+	if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) && !mod->arch.ftrace_trampoline) {
+		pr_err("%s: module ftrace trampoline section missing\n",
+		       mod->name);
+		return -ENOEXEC;
+	}
+
 	for (i = 0; i < ehdr->e_shnum; i++) {
 		Elf64_Rela *rels = (void *)ehdr + sechdrs[i].sh_offset;
 		int numrels = sechdrs[i].sh_size / sizeof(Elf64_Rela);
-- 
2.9.3


* [PATCH v2 2/2] arm64: ftrace: add support for far branches to dynamic ftrace
From: Will Deacon @ 2017-06-05 17:15 UTC
  To: linux-arm-kernel

Hi Ard,

Thanks for posting this.

On Tue, May 30, 2017 at 01:52:20PM +0000, Ard Biesheuvel wrote:
> Currently, dynamic ftrace support in the arm64 kernel assumes that all
> core kernel code is within range of the ordinary branch instructions
> that occur in module code. This is usually the case, but it is no longer
> guaranteed now that we have support for module PLTs and address space
> randomization.
> 
> Since on arm64, all patching of branch instructions involves function calls
> to the same entry point [ftrace_caller()], we can emit the modules with a
> trampoline that has unlimited range, and patch both the trampoline itself
> and the branch instruction to redirect the call via the trampoline.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

--->8

> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
> index 1dcb69d3d0e5..f2b4e816b6de 100644
> --- a/arch/arm64/kernel/Makefile
> +++ b/arch/arm64/kernel/Makefile
> @@ -62,3 +62,6 @@ extra-y					+= $(head-y) vmlinux.lds
>  ifeq ($(CONFIG_DEBUG_EFI),y)
>  AFLAGS_head.o += -DVMLINUX_PATH="\"$(realpath $(objtree)/vmlinux)\""
>  endif
> +
> +# will be included by each individual module but not by the core kernel itself
> +extra-$(CONFIG_DYNAMIC_FTRACE) += ftrace-mod.o
> diff --git a/arch/arm64/kernel/ftrace-mod.S b/arch/arm64/kernel/ftrace-mod.S
> new file mode 100644
> index 000000000000..00c4025be4ff
> --- /dev/null
> +++ b/arch/arm64/kernel/ftrace-mod.S
> @@ -0,0 +1,18 @@
> +/*
> + * Copyright (C) 2017 Linaro Ltd <ard.biesheuvel@linaro.org>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/linkage.h>
> +#include <asm/assembler.h>
> +
> +	.section	".text.ftrace_trampoline", "ax"
> +	.align		3
> +0:	.quad		0
> +__ftrace_trampoline:
> +	ldr		x16, 0b
> +	br		x16

[...]

> +static u32 __ftrace_gen_branch(unsigned long pc, unsigned long addr)
> +{
> +	long offset = (long)pc - (long)addr;
> +	unsigned long *tramp;
> +	struct module *mod;
> +
> +	if (IS_ENABLED(CONFIG_ARM64_MODULE_PLTS) &&
> +	    (offset < -SZ_128M || offset >= SZ_128M)) {
> +
> +		/*
> +		 * On kernels that support module PLTs, the offset between the
> +		 * call and its target may legally exceed the range of an
> +		 * ordinary branch instruction. In this case, we need to branch
> +		 * via a trampoline in the module.
> +		 */
> +		mod = __module_address(pc);
> +		if (WARN_ON(!mod))
> +			return AARCH64_BREAK_FAULT;
> +
> +		/*
> +		 * There is only one ftrace trampoline per module. For now,
> +		 * this is not a problem since on arm64, all dynamic ftrace
> +		 * invocations are routed via ftrace_caller(). This will need
> +		 * to be revisited if support for multiple ftrace entry points
> +		 * is added in the future, but for now, the pr_err() below
> +		 * deals with a theoretical issue only.
> +		 */
> +		tramp = (unsigned long *)mod->arch.ftrace_trampoline->sh_addr;
> +		if (tramp[0] != addr) {
> +			if (tramp[0] != 0) {
> +				pr_err("ftrace: far branches to multiple entry points unsupported inside a single module\n");
> +				return AARCH64_BREAK_FAULT;
> +			}
> +
> +			/* point the trampoline to our ftrace entry point */
> +			module_disable_ro(mod);
> +			tramp[0] = addr;
> +			module_enable_ro(mod, true);

I'm not sure what the barrier semantics are for module_enable_ro, but I'd be
inclined to stick in a smp_wmb() here to order the write of the trampoline
data before the writing of the branch instruction.
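
Concretely, the barrier would slot in right after the quoted hunk; a sketch
in the same style (untested, and not necessarily what was eventually
merged):

		/* point the trampoline to our ftrace entry point */
		module_disable_ro(mod);
		tramp[0] = addr;
		module_enable_ro(mod, true);

		/*
		 * Order the write of the trampoline's target slot before
		 * the write of the 'bl' at the call site, so that no CPU
		 * can take the new branch and still load a stale (zero)
		 * target.
		 */
		smp_wmb();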

Will

> +		}
> +		addr = (unsigned long)&tramp[1];
> +	}
> +	return aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
> +}
> +


* [PATCH v2 1/2] arm64: ftrace: don't validate branch via PLT in ftrace_make_nop()
From: Will Deacon @ 2017-06-05 17:15 UTC
  To: linux-arm-kernel

On Tue, May 30, 2017 at 01:52:19PM +0000, Ard Biesheuvel wrote:
> When turning branch instructions into NOPs, we attempt to validate the
> action by comparing the old value at the call site with the opcode of
> a direct relative branch instruction pointing at the old target.
> 
> However, these call sites are statically initialized to call _mcount(),
> and may be redirected via a PLT entry if the module is loaded far away
> from the kernel text, leading to false negatives and spurious errors.
> 
> So skip the validation if CONFIG_ARM64_MODULE_PLTS is configured.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  arch/arm64/kernel/ftrace.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
> index 40ad08ac569a..a8db6857cad6 100644
> --- a/arch/arm64/kernel/ftrace.c
> +++ b/arch/arm64/kernel/ftrace.c
> @@ -84,12 +84,15 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
>  		    unsigned long addr)
>  {
>  	unsigned long pc = rec->ip;
> -	u32 old, new;
> +	u32 old = 0, new;
>  
> -	old = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
> +	if (!IS_ENABLED(CONFIG_ARM64_MODULE_PLTS))
> +		old = aarch64_insn_gen_branch_imm(pc, addr,
> +						  AARCH64_INSN_BRANCH_LINK);
>  	new = aarch64_insn_gen_nop();
>  
> -	return ftrace_modify_code(pc, old, new, true);
> +	return ftrace_modify_code(pc, old, new,
> +				  !IS_ENABLED(CONFIG_ARM64_MODULE_PLTS));

Even in the case of PLTs, the instruction is still a branch instruction,
so could we validate the basic opcode fields rather than disable the check
altogether?

Will


* [PATCH v2 1/2] arm64: ftrace: don't validate branch via PLT in ftrace_make_nop()
From: Ard Biesheuvel @ 2017-06-05 17:26 UTC
  To: linux-arm-kernel

On 5 June 2017 at 17:15, Will Deacon <will.deacon@arm.com> wrote:
> On Tue, May 30, 2017 at 01:52:19PM +0000, Ard Biesheuvel wrote:
>> When turning branch instructions into NOPs, we attempt to validate the
>> action by comparing the old value at the call site with the opcode of
>> a direct relative branch instruction pointing at the old target.
>>
>> However, these call sites are statically initialized to call _mcount(),
>> and may be redirected via a PLT entry if the module is loaded far away
>> from the kernel text, leading to false negatives and spurious errors.
>>
>> So skip the validation if CONFIG_ARM64_MODULE_PLTS is configured.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> ---
>>  arch/arm64/kernel/ftrace.c | 9 ++++++---
>>  1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
>> index 40ad08ac569a..a8db6857cad6 100644
>> --- a/arch/arm64/kernel/ftrace.c
>> +++ b/arch/arm64/kernel/ftrace.c
>> @@ -84,12 +84,15 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
>>                   unsigned long addr)
>>  {
>>       unsigned long pc = rec->ip;
>> -     u32 old, new;
>> +     u32 old = 0, new;
>>
>> -     old = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
>> +     if (!IS_ENABLED(CONFIG_ARM64_MODULE_PLTS))
>> +             old = aarch64_insn_gen_branch_imm(pc, addr,
>> +                                               AARCH64_INSN_BRANCH_LINK);
>>       new = aarch64_insn_gen_nop();
>>
>> -     return ftrace_modify_code(pc, old, new, true);
>> +     return ftrace_modify_code(pc, old, new,
>> +                               !IS_ENABLED(CONFIG_ARM64_MODULE_PLTS));
>
> Even in the case of PLTs, the instruction is still a branch instruction,
> so could we validate the basic opcode fields rather than disable the check
> altogether?
>

That shouldn't be too hard.
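
For reference, such a check could verify just the opcode field and ignore
the offset bits; a sketch (the kernel's aarch64_insn_is_bl() helper
expresses, if I read its mask tables correctly, the same test):

    #include <stdbool.h>
    #include <stdint.h>

    /*
     * A bl instruction has bits [31:26] == 0b100101, i.e. opcode
     * 0x94000000 under mask 0xfc000000. The low 26 bits hold the target
     * offset, which is exactly the part a PLT-redirected call
     * legitimately changes, so only the opcode field is worth validating.
     */
    static bool insn_is_bl(uint32_t insn)
    {
            return (insn & 0xfc000000u) == 0x94000000u;
    }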


* [PATCH v2 2/2] arm64: ftrace: add support for far branches to dynamic ftrace
From: Steven Rostedt @ 2017-06-07 15:46 UTC
  To: linux-arm-kernel

On Mon, 5 Jun 2017 18:15:35 +0100
Will Deacon <will.deacon@arm.com> wrote:


> > +		tramp = (unsigned long *)mod->arch.ftrace_trampoline->sh_addr;
> > +		if (tramp[0] != addr) {
> > +			if (tramp[0] != 0) {
> > +				pr_err("ftrace: far branches to multiple entry points unsupported inside a single module\n");
> > +				return AARCH64_BREAK_FAULT;
> > +			}
> > +
> > +			/* point the trampoline to our ftrace entry point */
> > +			module_disable_ro(mod);
> > +			tramp[0] = addr;
> > +			module_enable_ro(mod, true);  
> 
> I'm not sure what the barrier semantics are for module_enable_ro, but I'd be
> inclined to stick in a smp_wmb() here to order the write of the trampoline
> data before the writing of the branch instruction.

I would assume that module_disable/enable_ro() has proper barriers for
modifying the page tables with respect to code around it, otherwise it
would probably be an issue elsewhere in the kernel. Specifically in
the module code itself.

I don't see how a smp_wmb() would be needed here, especially since this
is serialized code, and not something done by multiple CPUs.

-- Steve

> 
> Will
> 
> > +		}
> > +		addr = (unsigned long)&tramp[1];
> > +	}
> > +	return aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
> > +}
> > +  


* [PATCH v2 2/2] arm64: ftrace: add support for far branches to dynamic ftrace
From: Ard Biesheuvel @ 2017-06-07 15:50 UTC
  To: linux-arm-kernel

On 7 June 2017 at 15:46, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Mon, 5 Jun 2017 18:15:35 +0100
> Will Deacon <will.deacon@arm.com> wrote:
>
>
>> > +           tramp = (unsigned long *)mod->arch.ftrace_trampoline->sh_addr;
>> > +           if (tramp[0] != addr) {
>> > +                   if (tramp[0] != 0) {
>> > +                           pr_err("ftrace: far branches to multiple entry points unsupported inside a single module\n");
>> > +                           return AARCH64_BREAK_FAULT;
>> > +                   }
>> > +
>> > +                   /* point the trampoline to our ftrace entry point */
>> > +                   module_disable_ro(mod);
>> > +                   tramp[0] = addr;
>> > +                   module_enable_ro(mod, true);
>>
>> I'm not sure what the barrier semantics are for module_enable_ro, but I'd be
>> inclined to stick in a smp_wmb() here to order the write of the trampoline
>> data before the writing of the branch instruction.
>
> I would assume that module_disable/enable_ro() has proper barriers for
> modifying the page tables with respect to code around it, otherwise it
> would probably be an issue elsewhere in the kernel. Specifically in
> the module code itself.
>
> I don't see how a smp_wmb() would be needed here, especially since this
> is serialized code, and not something done by multiple CPUs.
>

But other cores could be invoking the function we are patching here,
no? So when such a core observes (and executes) the updated
instruction before it observes the updated target field of the
trampoline, it will branch to address 0x0.
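
Ard's scenario can be modeled in portable C11 as an analogy (real
instruction fetch has no acquire semantics, which is where the thread goes
next; the names here are invented for the sketch):

    #include <stdatomic.h>
    #include <stdint.h>

    static _Atomic uint64_t tramp_target;  /* trampoline literal slot, 0 until set */
    static _Atomic uint32_t callsite_insn; /* starts out as a NOP encoding         */

    /* Patching CPU: the target must be published before the branch. The
     * release store stands in for the proposed smp_wmb(); make both
     * stores relaxed and the failure below becomes possible. */
    static void patcher(uint64_t ftrace_entry, uint32_t bl_insn)
    {
            atomic_store_explicit(&tramp_target, ftrace_entry,
                                  memory_order_relaxed);
            atomic_store_explicit(&callsite_insn, bl_insn,
                                  memory_order_release);
    }

    /* Executing CPU: "fetch" the call site; if the bl is visible, "load"
     * the literal. Without the pairing above, it may observe the new bl
     * but a stale 0 in the slot -- i.e. branch to address 0x0. */
    static uint64_t executor(uint32_t nop_insn)
    {
            if (atomic_load_explicit(&callsite_insn,
                                     memory_order_acquire) == nop_insn)
                    return 0;       /* still a NOP, nothing to call */

            return atomic_load_explicit(&tramp_target, memory_order_relaxed);
    }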


* [PATCH v2 2/2] arm64: ftrace: add support for far branches to dynamic ftrace
From: Will Deacon @ 2017-06-07 16:56 UTC
  To: linux-arm-kernel

On Wed, Jun 07, 2017 at 03:50:10PM +0000, Ard Biesheuvel wrote:
> On 7 June 2017 at 15:46, Steven Rostedt <rostedt@goodmis.org> wrote:
> > On Mon, 5 Jun 2017 18:15:35 +0100
> > Will Deacon <will.deacon@arm.com> wrote:
> >
> >
> >> > +           tramp = (unsigned long *)mod->arch.ftrace_trampoline->sh_addr;
> >> > +           if (tramp[0] != addr) {
> >> > +                   if (tramp[0] != 0) {
> >> > +                           pr_err("ftrace: far branches to multiple entry points unsupported inside a single module\n");
> >> > +                           return AARCH64_BREAK_FAULT;
> >> > +                   }
> >> > +
> >> > +                   /* point the trampoline to our ftrace entry point */
> >> > +                   module_disable_ro(mod);
> >> > +                   tramp[0] = addr;
> >> > +                   module_enable_ro(mod, true);
> >>
> >> I'm not sure what the barrier semantics are for module_enable_ro, but I'd be
> >> inclined to stick in a smp_wmb() here to order the write of the trampoline
> >> data before the writing of the branch instruction.
> >
> > I would assume that module_disable/enable_ro() has proper barriers for
> > modifying the page tables with respect to code around it, otherwise it
> > would probably be an issue elsewhere in the kernel. Specifically in
> > the module code itself.
> >
> > I don't see how a smp_wmb() would be needed here, especially since this
> > is serialized code, and not something done by multiple CPUs.
> >
> 
> But other cores could be invoking the function we are patching here,
> no? So when such a core observes (and executes) the updated
> instruction before it observes the updated target field of the
> trampoline, it will branch to address 0x0.

I think Steve's saying that the TLB invalidation buried in the set_memory_*
functions called by module_enable_ro will have sufficient barriers to order
the write of the trampoline with respect to writing the branch instruction,
but this isn't quite right because module_enable_ro is affected by things
like CONFIG_STRICT_MODULE_RWX and rodata_enabled.

Will


* [PATCH v2 2/2] arm64: ftrace: add support for far branches to dynamic ftrace
From: Steven Rostedt @ 2017-06-07 19:02 UTC
  To: linux-arm-kernel

On Wed, 7 Jun 2017 17:56:26 +0100
Will Deacon <will.deacon@arm.com> wrote:

> On Wed, Jun 07, 2017 at 03:50:10PM +0000, Ard Biesheuvel wrote:
> > On 7 June 2017 at 15:46, Steven Rostedt <rostedt@goodmis.org> wrote:  
> > > On Mon, 5 Jun 2017 18:15:35 +0100
> > > Will Deacon <will.deacon@arm.com> wrote:
> > >
> > >  
> > >> > +           tramp = (unsigned long *)mod->arch.ftrace_trampoline->sh_addr;
> > >> > +           if (tramp[0] != addr) {
> > >> > +                   if (tramp[0] != 0) {
> > >> > +                           pr_err("ftrace: far branches to multiple entry points unsupported inside a single module\n");
> > >> > +                           return AARCH64_BREAK_FAULT;
> > >> > +                   }
> > >> > +
> > >> > +                   /* point the trampoline to our ftrace entry point */
> > >> > +                   module_disable_ro(mod);
> > >> > +                   tramp[0] = addr;
> > >> > +                   module_enable_ro(mod, true);  
> > >>
> > >> I'm not sure what the barrier semantics are for module_enable_ro, but I'd be
> > >> inclined to stick in a smp_wmb() here to order the write of the trampoline
> > >> data before the writing of the branch instruction.  
> > >
> > > I would assume that module_disable/enable_ro() has proper barriers for
> > > modifying the page tables with respect to code around it, otherwise it
> > > would probably be an issue elsewhere in the kernel. Specifically in
> > > the module code itself.
> > >
> > > I don't see how a smp_wmb() would be needed here, especially since this
> > > is serialized code, and not something done by multiple CPUs.
> > >  
> > 
> > But other cores could be invoking the function we are patching here,
> > no? So when such a core observes (and executes) the updated
> > instruction before it observes the updated target field of the
> > trampoline, it will branch to address 0x0.  
> 
> I think Steve's saying that the TLB invalidation buried in the set_memory_*
> functions called by module_enable_ro will have sufficient barriers to order
> the write of the trampoline with respect to writing the branch instruction,
> but this isn't quite right because module_enable_ro is affected by things
> like CONFIG_STRICT_MODULE_RWX and rodata_enabled.
> 

Sorry, I was thinking that you were worried about the update with the
page tables taking effect.

But from the above comment by Ard, I'm now thinking you are worried
about the jump to the trampoline itself when the branch is enabled. Now,
my question is, can a smp_wmb() be good enough as there's no smp_rmb()
to accompany it? Even with an smp_wmb() the reads can still come out of
order without any smp_rmb() on the execution path.

The way I get around that on x86 is I call on_each_cpu() a sync_core()
routine. See run_sync() in arch/x86/kernel/ftrace.c. That makes sure
that all cores are updated before I proceed to the next step.
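
For reference, the x86 scheme mentioned here is roughly the following
fragment (paraphrased from memory of arch/x86/kernel/ftrace.c of this era,
so treat the details as approximate):

    /* Force every online CPU through a context-serializing event so that
     * no CPU keeps executing a stale copy of the code after a patching
     * step. */
    static void do_sync_core(void *data)
    {
            sync_core();
    }

    static void run_sync(void)
    {
            /* third argument == 1: wait until every CPU has run the callback */
            on_each_cpu(do_sync_core, NULL, 1);
    }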

-- Steve


* [PATCH v2 2/2] arm64: ftrace: add support for far branches to dynamic ftrace
From: Will Deacon @ 2017-06-08  9:59 UTC
  To: linux-arm-kernel

Hi Steve,

On Wed, Jun 07, 2017 at 03:02:01PM -0400, Steven Rostedt wrote:
> On Wed, 7 Jun 2017 17:56:26 +0100
> Will Deacon <will.deacon@arm.com> wrote:
> 
> > On Wed, Jun 07, 2017 at 03:50:10PM +0000, Ard Biesheuvel wrote:
> > > On 7 June 2017 at 15:46, Steven Rostedt <rostedt@goodmis.org> wrote:  
> > > > On Mon, 5 Jun 2017 18:15:35 +0100
> > > > Will Deacon <will.deacon@arm.com> wrote:
> > > >
> > > >  
> > > >> > +           tramp = (unsigned long *)mod->arch.ftrace_trampoline->sh_addr;
> > > >> > +           if (tramp[0] != addr) {
> > > >> > +                   if (tramp[0] != 0) {
> > > >> > +                           pr_err("ftrace: far branches to multiple entry points unsupported inside a single module\n");
> > > >> > +                           return AARCH64_BREAK_FAULT;
> > > >> > +                   }
> > > >> > +
> > > >> > +                   /* point the trampoline to our ftrace entry point */
> > > >> > +                   module_disable_ro(mod);
> > > >> > +                   tramp[0] = addr;
> > > >> > +                   module_enable_ro(mod, true);  
> > > >>
> > > >> I'm not sure what the barrier semantics are for module_enable_ro, but I'd be
> > > >> inclined to stick in a smp_wmb() here to order the write of the trampoline
> > > >> data before the writing of the branch instruction.  
> > > >
> > > > I would assume that module_disable/enable_ro() has proper barriers for
> > > > modifying the page tables with respect to code around it, otherwise it
> > > > would probably be an issue elsewhere in the kernel. Specifically in
> > > > the module code itself.
> > > >
> > > > I don't see how a smp_wmb() would be needed here, especially since this
> > > > is serialized code, and not something done by multiple CPUs.
> > > >  
> > > 
> > > But other cores could be invoking the function we are patching here,
> > > no? So when such a core observes (and executes) the updated
> > > instruction before it observes the updated target field of the
> > > trampoline, it will branch to address 0x0.  
> > 
> > I think Steve's saying that the TLB invalidation buried in the set_memory_*
> > functions called by module_enable_ro will have sufficient barriers to order
> > the write of the trampoline with respect to writing the branch instruction,
> > but this isn't quite right because module_enable_ro is affected by things
> > like CONFIG_STRICT_MODULE_RWX and rodata_enabled.
> > 
> 
> Sorry, I was thinking that you were worried about the update with the
> page tables taking effect.
> 
> But from the above comment by Ard, I'm now thinking you are worried
> about the jump to the trampoline itself when the branch is enabled. Now,
> my question is, can a smp_wmb() be good enough as there's no smp_rmb()
> to accompany it? Even with an smp_wmb() the reads can still come out of
> order without any smp_rmb() on the execution path.

So this is actually really subtle and the architecture manual doesn't
give us *quite* what we need. The reads that we're trying to order are:

  1. The instruction fetch of the branch instruction
  2. The load of the offset for the indirect branch in the trampoline

You might think there's a strong dependency here because the second load
doesn't take place if the branch isn't taken, but branch prediction could
in theory break that assumption.

Now, having a branch predictor predict on a NOP is extremely unlikely,
so I'll see if I can get the architecture clarified here. In the meantime,
I've merged this patch because I'm pretty confident it will work in practice
and if the architecture really won't give us the guarantee then we'll have
other things to fix too (like the whole applicability of the _nosync
patching).
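
In the C11 analogy used earlier in the thread, the subtlety reads as
follows (a sketch; the names repeat the earlier invented ones, and the
analogy is imperfect since R1 is really an instruction fetch):

    #include <stdatomic.h>
    #include <stdint.h>

    static _Atomic uint32_t callsite_insn; /* as in the earlier sketch */
    static _Atomic uint64_t tramp_target;

    static uint64_t traced_call(uint32_t nop_insn)
    {
            /* R1: the "instruction fetch" of the patched call site. */
            uint32_t insn = atomic_load_explicit(&callsite_insn,
                                                 memory_order_relaxed);

            if (insn == nop_insn)
                    return 0;

            /*
             * R2: the literal load in the trampoline. The if() above
             * looks like it orders R2 after R1, but for two relaxed
             * loads a control dependency guarantees nothing: the machine
             * may satisfy R2 speculatively before R1 resolves -- the
             * branch-prediction loophole Will wants the architecture to
             * rule out.
             */
            return atomic_load_explicit(&tramp_target, memory_order_relaxed);
    }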

Will


Thread overview: 11+ messages
2017-05-30 13:52 [PATCH v2 0/2] arm64: ftrace: fix interop issues with module PLTs Ard Biesheuvel
2017-05-30 13:52 ` [PATCH v2 1/2] arm64: ftrace: don't validate branch via PLT in ftrace_make_nop() Ard Biesheuvel
2017-06-05 17:15   ` Will Deacon
2017-06-05 17:26     ` Ard Biesheuvel
2017-05-30 13:52 ` [PATCH v2 2/2] arm64: ftrace: add support for far branches to dynamic ftrace Ard Biesheuvel
2017-06-05 17:15   ` Will Deacon
2017-06-07 15:46     ` Steven Rostedt
2017-06-07 15:50       ` Ard Biesheuvel
2017-06-07 16:56         ` Will Deacon
2017-06-07 19:02           ` Steven Rostedt
2017-06-08  9:59             ` Will Deacon
