linux-kernel.vger.kernel.org archive mirror
* [RFC][PATCH] x86/ftrace: Have ftrace trampolines turn read-only at the end of system boot up
@ 2020-05-01  0:21 Steven Rostedt
  2020-05-01  4:47 ` Josh Poimboeuf
  0 siblings, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2020-05-01  0:21 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Andy Lutomirski,
	Borislav Petkov, Josh Poimboeuf, H. Peter Anvin

From: Steven Rostedt (VMware) <rostedt@goodmis.org>

Booting one of my machines, it triggered the following crash:

 Kernel/User page tables isolation: enabled
 ftrace: allocating 36577 entries in 143 pages
 Starting tracer 'function'
 BUG: unable to handle page fault for address: ffffffffa000005c
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0003) - permissions violation
 PGD 2014067 P4D 2014067 PUD 2015063 PMD 7b253067 PTE 7b252061
 Oops: 0003 [#1] PREEMPT SMP PTI
 CPU: 0 PID: 0 Comm: swapper Not tainted 5.4.0-test+ #24
 Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007
 RIP: 0010:text_poke_early+0x4a/0x58
 Code: 34 24 48 89 54 24 08 e8 bf 72 0b 00 48 8b 34 24 48 8b 4c 24 08 84 c0 74 0b 48 89 df f3 a4 48 83 c4 10 5b c3 9c 58 fa 48 89 df <f3> a4 50 9d 48 83 c4 10 5b e9 d6 f9 ff ff 0 41 57 49
 RSP: 0000:ffffffff82003d38 EFLAGS: 00010046
 RAX: 0000000000000046 RBX: ffffffffa000005c RCX: 0000000000000005
 RDX: 0000000000000005 RSI: ffffffff825b9a90 RDI: ffffffffa000005c
 RBP: ffffffffa000005c R08: 0000000000000000 R09: ffffffff8206e6e0
 R10: ffff88807b01f4c0 R11: ffffffff8176c106 R12: ffffffff8206e6e0
 R13: ffffffff824f2440 R14: 0000000000000000 R15: ffffffff8206eac0
 FS:  0000000000000000(0000) GS:ffff88807d400000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffffffffa000005c CR3: 0000000002012000 CR4: 00000000000006b0
 Call Trace:
  text_poke_bp+0x27/0x64
  ? mutex_lock+0x36/0x5d
  arch_ftrace_update_trampoline+0x287/0x2d5
  ? ftrace_replace_code+0x14b/0x160
  ? ftrace_update_ftrace_func+0x65/0x6c
  __register_ftrace_function+0x6d/0x81
  ftrace_startup+0x23/0xc1
  register_ftrace_function+0x20/0x37
  func_set_flag+0x59/0x77
  __set_tracer_option.isra.19+0x20/0x3e
  trace_set_options+0xd6/0x13e
  apply_trace_boot_options+0x44/0x6d
  register_tracer+0x19e/0x1ac
  early_trace_init+0x21b/0x2c9
  start_kernel+0x241/0x518
  ? load_ucode_intel_bsp+0x21/0x52
  secondary_startup_64+0xa4/0xb0

I was able to trigger it on other machines as well, by adding both
"ftrace=function" and "trace_options=func_stack_trace" to the kernel
command line.

The cause is that "ftrace=function" registers the function tracer and
creates a trampoline, which it sets executable and read-only. Then
"trace_options=func_stack_trace" updates the same trampoline to use the
stack tracing version of the function tracer. Since the trampoline
already exists, it is updated with text_poke_bp(). The problem is that
when text_poke_bp() is called while system_state == SYSTEM_BOOTING, it
simply does a memcpy() and does not change the page mapping, as it
assumes the text is still read-write. But in this case it is not, so we
take a fault and crash.

Instead, let's keep the ftrace trampolines read-write during boot up,
and then when the kernel executable text is set to read-only, set the
ftrace trampolines read-only as well.

Cc: stable@vger.kernel.org
Fixes: 768ae4406a5c ("x86/ftrace: Use text_poke()")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index dbd9b08bf173..e14f792b106c 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -43,6 +43,12 @@ struct dyn_arch_ftrace {
 
 #ifndef __ASSEMBLY__
 
+#ifdef CONFIG_FUNCTION_TRACER
+extern void set_ftrace_ops_ro(void);
+#else
+static inline void set_ftrace_ops_ro(void) { }
+#endif
+
 #define ARCH_HAS_SYSCALL_MATCH_SYM_NAME
 static inline bool arch_syscall_match_sym_name(const char *sym, const char *name)
 {
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 3d8adeba2651..dd30ba1e0244 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -419,7 +419,8 @@ create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size)
 
 	set_vm_flush_reset_perms(trampoline);
 
-	set_memory_ro((unsigned long)trampoline, npages);
+	if (likely(system_state != SYSTEM_BOOTING))
+		set_memory_ro((unsigned long)trampoline, npages);
 	set_memory_x((unsigned long)trampoline, npages);
 	return (unsigned long)trampoline;
 fail:
@@ -427,6 +428,32 @@ create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size)
 	return 0;
 }
 
+void set_ftrace_ops_ro(void)
+{
+	struct ftrace_ops *ops;
+	unsigned long start_offset;
+	unsigned long end_offset;
+	unsigned long npages;
+	unsigned long size;
+
+	do_for_each_ftrace_op(ops, ftrace_ops_list) {
+		if (!(ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP))
+			continue;
+
+		if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) {
+			start_offset = (unsigned long)ftrace_regs_caller;
+			end_offset = (unsigned long)ftrace_regs_caller_end;
+		} else {
+			start_offset = (unsigned long)ftrace_caller;
+			end_offset = (unsigned long)ftrace_epilogue;
+		}
+		size = end_offset - start_offset;
+		size = size + RET_SIZE + sizeof(void *);
+		npages = DIV_ROUND_UP(size, PAGE_SIZE);
+		set_memory_ro((unsigned long)ops->trampoline, npages);
+	} while_for_each_ftrace_op(ops);
+}
+
 static unsigned long calc_trampoline_call_offset(bool save_regs)
 {
 	unsigned long start_offset;
diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 930edeb41ec3..75d5d2e591ea 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -52,6 +52,7 @@
 #include <asm/page_types.h>
 #include <asm/cpu_entry_area.h>
 #include <asm/init.h>
+#include <asm/ftrace.h>
 
 #include "mm_internal.h"
 
@@ -930,6 +931,8 @@ void mark_rodata_ro(void)
 
 	kernel_set_to_readonly = 1;
 
+	set_ftrace_ops_ro();
+
 #ifdef CONFIG_CPA_DEBUG
 	pr_info("Testing CPA: Reverting %lx-%lx\n", start, start + size);
 	set_pages_rw(virt_to_page(start), size >> PAGE_SHIFT);
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index dcb9bc961b39..d36da393a947 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -54,6 +54,7 @@
 #include <asm/init.h>
 #include <asm/uv/uv.h>
 #include <asm/setup.h>
+#include <asm/ftrace.h>
 
 #include "mm_internal.h"
 
@@ -1326,6 +1327,8 @@ void mark_rodata_ro(void)
 	all_end = roundup((unsigned long)_brk_end, PMD_SIZE);
 	set_memory_nx(text_end, (all_end - text_end) >> PAGE_SHIFT);
 
+	set_ftrace_ops_ro();
+
 #ifdef CONFIG_CPA_DEBUG
 	printk(KERN_INFO "Testing CPA: undo %lx-%lx\n", start, end);
 	set_memory_rw(start, (end-start) >> PAGE_SHIFT);
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 9141f2263286..f97626cbfbdf 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -203,6 +203,29 @@ struct ftrace_ops {
 #endif
 };
 
+extern struct ftrace_ops __rcu *ftrace_ops_list;
+extern struct ftrace_ops ftrace_list_end;
+
+/*
+ * Traverse the ftrace_global_list, invoking all entries.  The reason that we
+ * can use rcu_dereference_raw_check() is that elements removed from this list
+ * are simply leaked, so there is no need to interact with a grace-period
+ * mechanism.  The rcu_dereference_raw_check() calls are needed to handle
+ * concurrent insertions into the ftrace_global_list.
+ *
+ * Silly Alpha and silly pointer-speculation compiler optimizations!
+ */
+#define do_for_each_ftrace_op(op, list)			\
+	op = rcu_dereference_raw_check(list);			\
+	do
+
+/*
+ * Optimized for just a single item in the list (as that is the normal case).
+ */
+#define while_for_each_ftrace_op(op)				\
+	while (likely(op = rcu_dereference_raw_check((op)->next)) &&	\
+	       unlikely((op) != &ftrace_list_end))
+
 /*
  * Type of the current tracing.
  */
diff --git a/kernel/trace/ftrace_internal.h b/kernel/trace/ftrace_internal.h
index 0456e0a3dab1..382775edf690 100644
--- a/kernel/trace/ftrace_internal.h
+++ b/kernel/trace/ftrace_internal.h
@@ -4,28 +4,6 @@
 
 #ifdef CONFIG_FUNCTION_TRACER
 
-/*
- * Traverse the ftrace_global_list, invoking all entries.  The reason that we
- * can use rcu_dereference_raw_check() is that elements removed from this list
- * are simply leaked, so there is no need to interact with a grace-period
- * mechanism.  The rcu_dereference_raw_check() calls are needed to handle
- * concurrent insertions into the ftrace_global_list.
- *
- * Silly Alpha and silly pointer-speculation compiler optimizations!
- */
-#define do_for_each_ftrace_op(op, list)			\
-	op = rcu_dereference_raw_check(list);			\
-	do
-
-/*
- * Optimized for just a single item in the list (as that is the normal case).
- */
-#define while_for_each_ftrace_op(op)				\
-	while (likely(op = rcu_dereference_raw_check((op)->next)) &&	\
-	       unlikely((op) != &ftrace_list_end))
-
-extern struct ftrace_ops __rcu *ftrace_ops_list;
-extern struct ftrace_ops ftrace_list_end;
 extern struct mutex ftrace_lock;
 extern struct ftrace_ops global_ops;
 

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RFC][PATCH] x86/ftrace: Have ftrace trampolines turn read-only at the end of system boot up
  2020-05-01  0:21 [RFC][PATCH] x86/ftrace: Have ftrace trampolines turn read-only at the end of system boot up Steven Rostedt
@ 2020-05-01  4:47 ` Josh Poimboeuf
  2020-05-01  5:17   ` Josh Poimboeuf
  0 siblings, 1 reply; 7+ messages in thread
From: Josh Poimboeuf @ 2020-05-01  4:47 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
	Andy Lutomirski, Borislav Petkov, H. Peter Anvin

On Thu, Apr 30, 2020 at 08:21:47PM -0400, Steven Rostedt wrote:
> The cause is that "ftrace=function" registers the function tracer and
> creates a trampoline, which it sets executable and read-only. Then
> "trace_options=func_stack_trace" updates the same trampoline to use the
> stack tracing version of the function tracer. Since the trampoline
> already exists, it is updated with text_poke_bp(). The problem is that
> when text_poke_bp() is called while system_state == SYSTEM_BOOTING, it
> simply does a memcpy() and does not change the page mapping, as it
> assumes the text is still read-write. But in this case it is not, so we
> take a fault and crash.
> 
> Instead, let's keep the ftrace trampolines read-write during boot up,
> and then when the kernel executable text is set to read-only, set the
> ftrace trampolines read-only as well.

Would it be easier to just call a new __text_poke_bp() which skips the
SYSTEM_BOOTING check, since you know the trampoline will always be
read-only?

Like:

diff --git a/arch/x86/include/asm/text-patching.h b/arch/x86/include/asm/text-patching.h
index 67315fa3956a..710106256916 100644
--- a/arch/x86/include/asm/text-patching.h
+++ b/arch/x86/include/asm/text-patching.h
@@ -45,6 +45,7 @@ extern void *text_poke(void *addr, const void *opcode, size_t len);
 extern void text_poke_sync(void);
 extern void *text_poke_kgdb(void *addr, const void *opcode, size_t len);
 extern int poke_int3_handler(struct pt_regs *regs);
+extern void __text_poke_bp(void *addr, const void *opcode, size_t len, const void *emulate);
 extern void text_poke_bp(void *addr, const void *opcode, size_t len, const void *emulate);
 
 extern void text_poke_queue(void *addr, const void *opcode, size_t len, const void *emulate);
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 7867dfb3963e..9cc983cc9291 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -1265,6 +1265,14 @@ void __ref text_poke_queue(void *addr, const void *opcode, size_t len, const voi
 	text_poke_loc_init(tp, addr, opcode, len, emulate);
 }
 
+void __ref __text_poke_bp(void *addr, const void *opcode, size_t len, const void *emulate)
+{
+	struct text_poke_loc tp;
+
+	text_poke_loc_init(&tp, addr, opcode, len, emulate);
+	text_poke_bp_batch(&tp, 1);
+}
+
 /**
  * text_poke_bp() -- update instructions on live kernel on SMP
  * @addr:	address to patch
@@ -1278,13 +1286,10 @@ void __ref text_poke_queue(void *addr, const void *opcode, size_t len, const voi
  */
 void __ref text_poke_bp(void *addr, const void *opcode, size_t len, const void *emulate)
 {
-	struct text_poke_loc tp;
-
 	if (unlikely(system_state == SYSTEM_BOOTING)) {
 		text_poke_early(addr, opcode, len);
 		return;
 	}
 
-	text_poke_loc_init(&tp, addr, opcode, len, emulate);
-	text_poke_bp_batch(&tp, 1);
+	__text_poke_bp(addr, opcode, len, emulate);
 }
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 867c126ddabe..c36f51f01f6e 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -469,7 +469,7 @@ void arch_ftrace_update_trampoline(struct ftrace_ops *ops)
 	mutex_lock(&text_mutex);
 	/* Do a safe modify in case the trampoline is executing */
 	new = ftrace_call_replace(ip, (unsigned long)func);
-	text_poke_bp((void *)ip, new, MCOUNT_INSN_SIZE, NULL);
+	__text_poke_bp((void *)ip, new, MCOUNT_INSN_SIZE, NULL);
 	mutex_unlock(&text_mutex);
 }
 



* Re: [RFC][PATCH] x86/ftrace: Have ftrace trampolines turn read-only at the end of system boot up
  2020-05-01  4:47 ` Josh Poimboeuf
@ 2020-05-01  5:17   ` Josh Poimboeuf
  2020-05-01 13:24     ` Steven Rostedt
  0 siblings, 1 reply; 7+ messages in thread
From: Josh Poimboeuf @ 2020-05-01  5:17 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
	Andy Lutomirski, Borislav Petkov, H. Peter Anvin

On Thu, Apr 30, 2020 at 11:47:33PM -0500, Josh Poimboeuf wrote:
> On Thu, Apr 30, 2020 at 08:21:47PM -0400, Steven Rostedt wrote:
> > The cause is that "ftrace=function" registers the function tracer and
> > creates a trampoline, which it sets executable and read-only. Then
> > "trace_options=func_stack_trace" updates the same trampoline to use the
> > stack tracing version of the function tracer. Since the trampoline
> > already exists, it is updated with text_poke_bp(). The problem is that
> > when text_poke_bp() is called while system_state == SYSTEM_BOOTING, it
> > simply does a memcpy() and does not change the page mapping, as it
> > assumes the text is still read-write. But in this case it is not, so we
> > take a fault and crash.
> > 
> > Instead, let's keep the ftrace trampolines read-write during boot up,
> > and then when the kernel executable text is set to read-only, set the
> > ftrace trampolines read-only as well.
> 
> Would it be easier to just call a new __text_poke_bp() which skips the
> SYSTEM_BOOTING check, since you know the trampoline will always be
> read-only?
> 
> Like:

early_trace_init() is called after mm_init(), so I thought it might
work, but I guess not:

[    0.206271] Starting tracer 'function'
[    0.232032] BUG: kernel NULL pointer dereference, address: 0000000000000050
[    0.232035] #PF: supervisor read access in kernel mode
[    0.232036] #PF: error_code(0x0000) - not-present page
[    0.232037] PGD 0 P4D 0 
[    0.232040] Oops: 0000 [#1] PREEMPT SMP PTI
[    0.232042] CPU: 0 PID: 0 Comm: swapper Not tainted 5.7.0-rc2-next-20200424+ #127
[    0.232043] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
[    0.232047] RIP: 0010:walk_to_pmd+0x11/0x140
[    0.232048] Code: 78 c1 32 82 e8 a0 68 ff ff 0f 0b 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 0f 1f 44 00 00 48 89 f0 48 c1 e8 24 25 f8 0f 00 00 <48> 03 47 50 0f 84 12 01 00 00 41 54 49 89 fc 55 48 89 c5 53 48 8b
[    0.232050] RSP: 0000:ffffffff82603c68 EFLAGS: 00010046
[    0.232051] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[    0.232052] RDX: ffffffff82603cc8 RSI: 0000000000000000 RDI: 0000000000000000
[    0.232053] RBP: 8000000000000063 R08: 0000000000000001 R09: fff0000000000fff
[    0.232054] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff82603cc8
[    0.232055] R13: 0000000000000000 R14: 0000000000000066 R15: fffffffffffffeff
[    0.232056] FS:  0000000000000000(0000) GS:ffff88813b600000(0000) knlGS:0000000000000000
[    0.232057] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.232058] CR2: 0000000000000050 CR3: 0000000002612001 CR4: 00000000000606b0
[    0.232061] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.232062] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    0.232063] Call Trace:
[    0.232066]  __get_locked_pte+0x19/0x100
[    0.232069]  ? 0xffffffffa0000065
[    0.232071]  __text_poke+0x14d/0x640
[    0.232073]  ? 0xffffffffa0000065
[    0.232076]  text_poke_bp_batch+0x8b/0x1c0
[    0.232078]  ? 0xffffffffa0000065
[    0.232080]  __text_poke_bp+0x3a/0x60
[    0.232083]  arch_ftrace_update_trampoline+0xac/0x300
[    0.232087]  __register_ftrace_function+0x7c/0xc0
[    0.232089]  ftrace_startup+0x1e/0xf0
[    0.232091]  register_ftrace_function+0x24/0x70
[    0.232093]  func_set_flag+0x6f/0x90
[    0.232096]  __set_tracer_option.isra.0+0x24/0x50
[    0.232098]  trace_set_options+0x149/0x160
[    0.232102]  apply_trace_boot_options+0x44/0x6d
[    0.232105]  register_tracer+0x1d8/0x1e9
[    0.232107]  early_trace_init+0x266/0x378
[    0.232109]  start_kernel+0x337/0x614
[    0.232112]  ? x86_family+0x5/0x20
[    0.232114]  secondary_startup_64+0xa4/0xb0
[    0.232119] Modules linked in:
[    0.232120] CR2: 0000000000000050
[    0.232126] random: get_random_bytes called from print_oops_end_marker+0x26/0x40 with crng_init=0
[    0.232128] ---[ end trace 71e23a89b9b5224f ]---
[    0.232130] RIP: 0010:walk_to_pmd+0x11/0x140
[    0.232131] Code: 78 c1 32 82 e8 a0 68 ff ff 0f 0b 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 0f 1f 44 00 00 48 89 f0 48 c1 e8 24 25 f8 0f 00 00 <48> 03 47 50 0f 84 12 01 00 00 41 54 49 89 fc 55 48 89 c5 53 48 8b
[    0.232133] RSP: 0000:ffffffff82603c68 EFLAGS: 00010046
[    0.232134] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[    0.232135] RDX: ffffffff82603cc8 RSI: 0000000000000000 RDI: 0000000000000000
[    0.232136] RBP: 8000000000000063 R08: 0000000000000001 R09: fff0000000000fff
[    0.232137] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff82603cc8
[    0.232138] R13: 0000000000000000 R14: 0000000000000066 R15: fffffffffffffeff
[    0.232139] FS:  0000000000000000(0000) GS:ffff88813b600000(0000) knlGS:0000000000000000
[    0.232140] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.232141] CR2: 0000000000000050 CR3: 0000000002612001 CR4: 00000000000606b0
[    0.232142] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.232143] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    0.232144] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.232184] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---



* Re: [RFC][PATCH] x86/ftrace: Have ftrace trampolines turn read-only at the end of system boot up
  2020-05-01  5:17   ` Josh Poimboeuf
@ 2020-05-01 13:24     ` Steven Rostedt
  2020-05-01 15:13       ` Josh Poimboeuf
  0 siblings, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2020-05-01 13:24 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
	Andy Lutomirski, Borislav Petkov, H. Peter Anvin

On Fri, 1 May 2020 00:17:06 -0500
Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> > Would it be easier to just call a new __text_poke_bp() which skips the
> > SYSTEM_BOOTING check, since you know the trampoline will always be
> > read-only?
> > 
> > Like:  
> 
> early_trace_init() is called after mm_init(), so I thought it might
> work, but I guess not:

Yeah, I was about to say that this happens before mm_init() ;-)

It's why we already have magic for enabling function tracing the first time.

Do you see anything wrong with this current solution? It probably needs
more comments, but I wanted to get acceptance on the logic before I go and
pretty it up and send a non RFC patch.

-- Steve


* Re: [RFC][PATCH] x86/ftrace: Have ftrace trampolines turn read-only at the end of system boot up
  2020-05-01 13:24     ` Steven Rostedt
@ 2020-05-01 15:13       ` Josh Poimboeuf
  2020-05-01 16:19         ` Steven Rostedt
  0 siblings, 1 reply; 7+ messages in thread
From: Josh Poimboeuf @ 2020-05-01 15:13 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
	Andy Lutomirski, Borislav Petkov, H. Peter Anvin

On Fri, May 01, 2020 at 09:24:04AM -0400, Steven Rostedt wrote:
> On Fri, 1 May 2020 00:17:06 -0500
> Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> 
> > > Would it be easier to just call a new __text_poke_bp() which skips the
> > > SYSTEM_BOOTING check, since you know the trampoline will always be
> > > read-only?
> > > 
> > > Like:  
> > 
> > early_trace_init() is called after mm_init(), so I thought it might
> > work, but I guess not:
> 
> Yeah, I was about to say that this happens before mm_init() ;-)

It happens *after* mm_init().  But now text_poke() has a dependency on
poking_init(), which has a dependency on proc_caches_init(), which has
a dependency on kmem_cache_init_late(), etc.

So how early do you need early_trace_init()?  I'm assuming moving it to
after kmem_cache_init_late() would be too late.
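
The dependency chain above can be sketched as a rough outline of
start_kernel() (pseudocode only; exact call sites and names vary by
kernel version, and the ordering shown is inferred from the chain
described in this message):

```
start_kernel()
	mm_init()                /* page tables and slab bootstrap */
	early_trace_init()       /* "ftrace=function" may patch a trampoline here */
	...
	kmem_cache_init_late()
	proc_caches_init()       /* caches that poking_init() depends on */
	poking_init()            /* only after this is text_poke() usable */
	...
```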

> It's why we already have magic for enabling function tracing the first time.
> 
> Do you see anything wrong with this current solution? It probably needs
> more comments, but I wanted to get acceptance on the logic before I go and
> pretty it up and send a non RFC patch.

Assuming we can't get text_poke() working earlier, it seems reasonable
to me.

-- 
Josh



* Re: [RFC][PATCH] x86/ftrace: Have ftrace trampolines turn read-only at the end of system boot up
  2020-05-01 15:13       ` Josh Poimboeuf
@ 2020-05-01 16:19         ` Steven Rostedt
  2020-05-01 17:29           ` Peter Zijlstra
  0 siblings, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2020-05-01 16:19 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
	Andy Lutomirski, Borislav Petkov, H. Peter Anvin

On Fri, 1 May 2020 10:13:10 -0500
Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> On Fri, May 01, 2020 at 09:24:04AM -0400, Steven Rostedt wrote:
> > On Fri, 1 May 2020 00:17:06 -0500
> > Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> >   
> > > > Would it be easier to just call a new __text_poke_bp() which skips the
> > > > SYSTEM_BOOTING check, since you know the trampoline will always be
> > > > read-only?
> > > > 
> > > > Like:    
> > > 
> > > early_trace_init() is called after mm_init(), so I thought it might
> > > work, but I guess not:  
> > 
> > Yeah, I was about to say that this happens before mm_init() ;-)  
> 
> It happens *after* mm_init().  But now text_poke() has a dependency on
> poking_init(), has a dependency on proc_caches_init(), which has a
> dependency on kmem_cache_init_late(), etc.
> 
> So how early do you need early_trace_init()?  I'm assuming moving it to
> after kmem_cache_init_late() would be too late.

People have asked to move it even earlier. The point of having it early is
to allow tracing to debug early boot up.

> 
> > It's why we already have magic for enabling function tracing the first time.
> > 
> > Do you see anything wrong with this current solution? It probably needs
> > more comments, but I wanted to get acceptance on the logic before I go and
> > pretty it up and send a non RFC patch.  
> 
> Assuming we can't get text_poke() working earlier, it seems reasonable
> to me.
> 

Thanks.

Peter, what about you?

-- Steve


* Re: [RFC][PATCH] x86/ftrace: Have ftrace trampolines turn read-only at the end of system boot up
  2020-05-01 16:19         ` Steven Rostedt
@ 2020-05-01 17:29           ` Peter Zijlstra
  0 siblings, 0 replies; 7+ messages in thread
From: Peter Zijlstra @ 2020-05-01 17:29 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Josh Poimboeuf, LKML, Ingo Molnar, Thomas Gleixner,
	Andy Lutomirski, Borislav Petkov, H. Peter Anvin

On Fri, May 01, 2020 at 12:19:16PM -0400, Steven Rostedt wrote:
> 
> Peter, what about you?

It's all a bit unfortunate, but yeah, seems reasonable. Ack.


end of thread, other threads:[~2020-05-01 17:29 UTC | newest]

Thread overview: 7+ messages
2020-05-01  0:21 [RFC][PATCH] x86/ftrace: Have ftrace trampolines turn read-only at the end of system boot up Steven Rostedt
2020-05-01  4:47 ` Josh Poimboeuf
2020-05-01  5:17   ` Josh Poimboeuf
2020-05-01 13:24     ` Steven Rostedt
2020-05-01 15:13       ` Josh Poimboeuf
2020-05-01 16:19         ` Steven Rostedt
2020-05-01 17:29           ` Peter Zijlstra
