All of lore.kernel.org
 help / color / mirror / Atom feed
* WARNING at arch/x86/kernel/alternative.c:707 text_poke+0x25d/0x270
@ 2017-05-16 13:48 Steven Rostedt
  2017-05-16 15:15 ` Masami Hiramatsu
  0 siblings, 1 reply; 6+ messages in thread
From: Steven Rostedt @ 2017-05-16 13:48 UTC (permalink / raw)
  To: Masami Hiramatsu; +Cc: LKML, Thomas Gleixner

Hi Masami,

Thomas hit the following bug:

[  169.795937] trace_kprobe: Testing kprobe tracing: OK
[  169.833031] rtc_cmos 00:00: setting system clock to 2017-05-16 13:19:26 UTC (1494940766)
[  169.833986] PM: Hibernation image not present or could not be loaded.
[  169.836222] Freeing unused kernel memory: 1376K
[  169.836793] ------------[ cut here ]------------
[  169.836802] WARNING: CPU: 1 PID: 240 at /home/tglx/work/kernel/linus/linux/arch/x86/kernel/alternative.c:707 text_poke+0x25d/0x270
[  169.836803] Modules linked in:
[  169.836806] CPU: 1 PID: 240 Comm: kworker/1:1 Not tainted 4.12.0-rc1+ #138
[  169.836808] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[  169.836812] Workqueue: events kprobe_optimizer
[  169.836819] task: ffff8802344e28c0 task.stack: ffffc90001cac000
[  169.836821] RIP: 0010:text_poke+0x25d/0x270
[  169.836822] RSP: 0000:ffffc90001cafd38 EFLAGS: 00010246
[  169.836825] RAX: 0100000000000000 RBX: ffffffff81d9daeb RCX: ffffffffa0008000
[  169.836826] RDX: 0000000000000000 RSI: ffffc90001cafd87 RDI: ffffffff81d9daeb
[  169.836827] RBP: ffffc90001cafd70 R08: 0000000000000001 R09: 0000000000000000
[  169.836828] R10: ffffc90001cafe08 R11: 0000000000000008 R12: 0000000000000001
[  169.836829] R13: ffffc90001cafd87 R14: ffffffff81d9daec R15: ffffea0000067a58
[  169.836831] FS:  0000000000000000(0000) GS:ffff88023ec40000(0000) knlGS:0000000000000000
[  169.836832] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  169.836833] CR2: ffffc9000146c000 CR3: 0000000001c0f000 CR4: 00000000003406e0
[  169.836837] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  169.836838] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  169.836839] Call Trace:
[  169.836845]  ? init_kprobe_trace+0x96/0x96
[  169.836848]  ? kprobe_trace_selftest_target+0x1/0x13
[  169.836851]  text_poke_bp+0x57/0xf0
[  169.836857]  arch_unoptimize_kprobe+0x42/0x60
[  169.836860]  arch_unoptimize_kprobes+0x35/0x90
[  169.836865]  kprobe_optimizer+0x63/0x2a0
[  169.836870]  process_one_work+0x1ed/0x630
[  169.836877]  worker_thread+0x69/0x3c0
[  169.836884]  kthread+0x138/0x170
[  169.836886]  ? process_one_work+0x630/0x630
[  169.836888]  ? kthread_create_on_node+0x70/0x70
[  169.836893]  ret_from_fork+0x31/0x40
[  169.836903] Code: e8 f9 5d 35 00 31 d2 31 f6 bf 86 02 00 00 ff 15 2a fc a2 00 e9 3d ff ff ff 48 8b 7d c8 57 9d 0f 1f 44 00 00 e8 25 95 09 00 eb ad <0f> ff e9 12 fe ff ff 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 
[  169.836955] ---[ end trace 3d393a48ed109eb0 ]---
[  169.837103] ------------[ cut here ]------------

It appears that the kprobe_optimizer work thread call happened after
the init pages were freed, causing alternative.c to give the above
warning because the text that is being unoptimized happens to no longer
exist.

This doesn't always happen, but it may have been something that's been
there for a while. Can you take a look at it, and perhaps find a way to
flush the work queue before exiting the kprobe test?

Thanks!

-- Steve

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: WARNING at arch/x86/kernel/alternative.c:707 text_poke+0x25d/0x270
  2017-05-16 13:48 WARNING at arch/x86/kernel/alternative.c:707 text_poke+0x25d/0x270 Steven Rostedt
@ 2017-05-16 15:15 ` Masami Hiramatsu
  2017-05-16 15:30   ` Steven Rostedt
  2017-05-16 21:42   ` Thomas Gleixner
  0 siblings, 2 replies; 6+ messages in thread
From: Masami Hiramatsu @ 2017-05-16 15:15 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: LKML, Thomas Gleixner

On Tue, 16 May 2017 09:48:02 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> Hi Masami,
> 
> Thomas hit the following bug:
> 
> [  169.795937] trace_kprobe: Testing kprobe tracing: OK
> [  169.833031] rtc_cmos 00:00: setting system clock to 2017-05-16 13:19:26 UTC (1494940766)
> [  169.833986] PM: Hibernation image not present or could not be loaded.
> [  169.836222] Freeing unused kernel memory: 1376K
> [  169.836793] ------------[ cut here ]------------
> [  169.836802] WARNING: CPU: 1 PID: 240 at /home/tglx/work/kernel/linus/linux/arch/x86/kernel/alternative.c:707 text_poke+0x25d/0x270
> [  169.836803] Modules linked in:
> [  169.836806] CPU: 1 PID: 240 Comm: kworker/1:1 Not tainted 4.12.0-rc1+ #138
> [  169.836808] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> [  169.836812] Workqueue: events kprobe_optimizer
> [  169.836819] task: ffff8802344e28c0 task.stack: ffffc90001cac000
> [  169.836821] RIP: 0010:text_poke+0x25d/0x270
> [  169.836822] RSP: 0000:ffffc90001cafd38 EFLAGS: 00010246
> [  169.836825] RAX: 0100000000000000 RBX: ffffffff81d9daeb RCX: ffffffffa0008000
> [  169.836826] RDX: 0000000000000000 RSI: ffffc90001cafd87 RDI: ffffffff81d9daeb
> [  169.836827] RBP: ffffc90001cafd70 R08: 0000000000000001 R09: 0000000000000000
> [  169.836828] R10: ffffc90001cafe08 R11: 0000000000000008 R12: 0000000000000001
> [  169.836829] R13: ffffc90001cafd87 R14: ffffffff81d9daec R15: ffffea0000067a58
> [  169.836831] FS:  0000000000000000(0000) GS:ffff88023ec40000(0000) knlGS:0000000000000000
> [  169.836832] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  169.836833] CR2: ffffc9000146c000 CR3: 0000000001c0f000 CR4: 00000000003406e0
> [  169.836837] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  169.836838] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  169.836839] Call Trace:
> [  169.836845]  ? init_kprobe_trace+0x96/0x96
> [  169.836848]  ? kprobe_trace_selftest_target+0x1/0x13
> [  169.836851]  text_poke_bp+0x57/0xf0
> [  169.836857]  arch_unoptimize_kprobe+0x42/0x60
> [  169.836860]  arch_unoptimize_kprobes+0x35/0x90
> [  169.836865]  kprobe_optimizer+0x63/0x2a0
> [  169.836870]  process_one_work+0x1ed/0x630
> [  169.836877]  worker_thread+0x69/0x3c0
> [  169.836884]  kthread+0x138/0x170
> [  169.836886]  ? process_one_work+0x630/0x630
> [  169.836888]  ? kthread_create_on_node+0x70/0x70
> [  169.836893]  ret_from_fork+0x31/0x40
> [  169.836903] Code: e8 f9 5d 35 00 31 d2 31 f6 bf 86 02 00 00 ff 15 2a fc a2 00 e9 3d ff ff ff 48 8b 7d c8 57 9d 0f 1f 44 00 00 e8 25 95 09 00 eb ad <0f> ff e9 12 fe ff ff 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 
> [  169.836955] ---[ end trace 3d393a48ed109eb0 ]---
> [  169.837103] ------------[ cut here ]------------
> 
> It appears that the kprobe_optimizer work thread call happened after
> the init pages were freed, causing alternative.c to give the above
> warning because the text that is being unoptimized happens to no longer
> exist.

Ah, I see. I need to check that case. Actually for the module
init text area, kill_kprobe() correctly kicks kill_optimized_kprobe()
so it should safe. But above case is on the init-text in kernel
itself. I guess module_notifier may not be called for that area...

> 
> This doesn't always happen, but it may have been something that's been
> there for a while.

I see, since it depends on the timing when the thread is kicked.

> Can you take a look at it, and perhaps find a way to
> flush the work queue before exiting the kprobe test?

Yes, I must handle it.

Thank you!

> 
> Thanks!
> 
> -- Steve


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: WARNING at arch/x86/kernel/alternative.c:707 text_poke+0x25d/0x270
  2017-05-16 15:15 ` Masami Hiramatsu
@ 2017-05-16 15:30   ` Steven Rostedt
  2017-05-16 16:27     ` Masami Hiramatsu
  2017-05-16 21:42   ` Thomas Gleixner
  1 sibling, 1 reply; 6+ messages in thread
From: Steven Rostedt @ 2017-05-16 15:30 UTC (permalink / raw)
  To: Masami Hiramatsu; +Cc: LKML, Thomas Gleixner

On Wed, 17 May 2017 00:15:39 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:


> > It appears that the kprobe_optimizer work thread call happened after
> > the init pages were freed, causing alternative.c to give the above
> > warning because the text that is being unoptimized happens to no longer
> > exist.  
> 
> Ah, I see. I need to check that case. Actually for the module
> init text area, kill_kprobe() correctly kicks kill_optimized_kprobe()
> so it should safe. But above case is on the init-text in kernel
> itself. I guess module_notifier may not be called for that area...

Hmm, what happens if you add a kprobe to a module, remove it, and then
remove the module. If the module is still loaded when it is removed,
wouldn't that cause the optimized probe to be delayed? Wouldn't that
open a race where the optimizer work queue can be called when no module
exists?

-- Steve

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: WARNING at arch/x86/kernel/alternative.c:707 text_poke+0x25d/0x270
  2017-05-16 15:30   ` Steven Rostedt
@ 2017-05-16 16:27     ` Masami Hiramatsu
  0 siblings, 0 replies; 6+ messages in thread
From: Masami Hiramatsu @ 2017-05-16 16:27 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: LKML, Thomas Gleixner

On Tue, 16 May 2017 11:30:19 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Wed, 17 May 2017 00:15:39 +0900
> Masami Hiramatsu <mhiramat@kernel.org> wrote:
> 
> 
> > > It appears that the kprobe_optimizer work thread call happened after
> > > the init pages were freed, causing alternative.c to give the above
> > > warning because the text that is being unoptimized happens to no longer
> > > exist.  
> > 
> > Ah, I see. I need to check that case. Actually for the module
> > init text area, kill_kprobe() correctly kicks kill_optimized_kprobe()
> > so it should safe. But above case is on the init-text in kernel
> > itself. I guess module_notifier may not be called for that area...
> 
> Hmm, what happens if you add a kprobe to a module, remove it, and then
> remove the module.

In that case the kprobe will be removed before unloading the module.
If the kprobe is optimized, the optimized probe itself is queued
to unoptimize.

> If the module is still loaded when it is removed,
> wouldn't that cause the optimized probe to be delayed?

Yes, it is queued, but before unloading module, it should be
dequeued forcibly by kill_optimized_kprobe() from module-notifier.

> Wouldn't that
> open a race where the optimizer work queue can be called when no module
> exists?

Since both kprobe_mutex and module_mutex are held in optimizer,
I couldn't think there is such race. I guess when the kernel's
init-text (not module's one) is freed, it doesn't kick the
module notifier and kprobes missed that.

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: WARNING at arch/x86/kernel/alternative.c:707 text_poke+0x25d/0x270
  2017-05-16 15:15 ` Masami Hiramatsu
  2017-05-16 15:30   ` Steven Rostedt
@ 2017-05-16 21:42   ` Thomas Gleixner
  2017-05-17  1:38     ` Masami Hiramatsu
  1 sibling, 1 reply; 6+ messages in thread
From: Thomas Gleixner @ 2017-05-16 21:42 UTC (permalink / raw)
  To: Masami Hiramatsu; +Cc: Steven Rostedt, LKML

On Wed, 17 May 2017, Masami Hiramatsu wrote:
> On Tue, 16 May 2017 09:48:02 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> > It appears that the kprobe_optimizer work thread call happened after
> > the init pages were freed, causing alternative.c to give the above
> > warning because the text that is being unoptimized happens to no longer
> > exist.
> 
> Ah, I see. I need to check that case. Actually for the module
> init text area, kill_kprobe() correctly kicks kill_optimized_kprobe()
> so it should safe. But above case is on the init-text in kernel
> itself. I guess module_notifier may not be called for that area...

Find below the patch I'm using for now .

Thanks,

	tglx

8<--------------------

--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -349,6 +349,9 @@ extern int proc_kprobes_optimization_han
 					     int write, void __user *buffer,
 					     size_t *length, loff_t *ppos);
 #endif
+extern void wait_for_kprobe_optimizer(void);
+#else
+static inline void wait_for_kprobe_optimizer(void) { }
 #endif /* CONFIG_OPTPROBES */
 #ifdef CONFIG_KPROBES_ON_FTRACE
 extern void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -595,7 +595,7 @@ static void kprobe_optimizer(struct work
 }
 
 /* Wait for completing optimization and unoptimization */
-static void wait_for_kprobe_optimizer(void)
+void wait_for_kprobe_optimizer(void)
 {
 	mutex_lock(&kprobe_mutex);
 
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1535,6 +1535,11 @@ static __init int kprobe_trace_self_test
 
 end:
 	release_all_trace_kprobes();
+	/*
+	 * Wait for the optimizer work to finish. Otherwise it might fiddle
+	 * with probes in already freed __init text.
+	 */
+	wait_for_kprobe_optimizer();
 	if (warn)
 		pr_cont("NG: Some tests are failed. Please check them.\n");
 	else

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: WARNING at arch/x86/kernel/alternative.c:707 text_poke+0x25d/0x270
  2017-05-16 21:42   ` Thomas Gleixner
@ 2017-05-17  1:38     ` Masami Hiramatsu
  0 siblings, 0 replies; 6+ messages in thread
From: Masami Hiramatsu @ 2017-05-17  1:38 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Steven Rostedt, LKML

On Tue, 16 May 2017 23:42:58 +0200 (CEST)
Thomas Gleixner <tglx@linutronix.de> wrote:

> On Wed, 17 May 2017, Masami Hiramatsu wrote:
> > On Tue, 16 May 2017 09:48:02 -0400
> > Steven Rostedt <rostedt@goodmis.org> wrote:
> > > It appears that the kprobe_optimizer work thread call happened after
> > > the init pages were freed, causing alternative.c to give the above
> > > warning because the text that is being unoptimized happens to no longer
> > > exist.
> > 
> > Ah, I see. I need to check that case. Actually for the module
> > init text area, kill_kprobe() correctly kicks kill_optimized_kprobe()
> > so it should safe. But above case is on the init-text in kernel
> > itself. I guess module_notifier may not be called for that area...
> 
> Find below the patch I'm using for now .

Thank you very much! It looks good to me.

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>

> 
> Thanks,
> 
> 	tglx
> 
> 8<--------------------
> 
> --- a/include/linux/kprobes.h
> +++ b/include/linux/kprobes.h
> @@ -349,6 +349,9 @@ extern int proc_kprobes_optimization_han
>  					     int write, void __user *buffer,
>  					     size_t *length, loff_t *ppos);
>  #endif
> +extern void wait_for_kprobe_optimizer(void);
> +#else
> +static inline void wait_for_kprobe_optimizer(void) { }
>  #endif /* CONFIG_OPTPROBES */
>  #ifdef CONFIG_KPROBES_ON_FTRACE
>  extern void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -595,7 +595,7 @@ static void kprobe_optimizer(struct work
>  }
>  
>  /* Wait for completing optimization and unoptimization */
> -static void wait_for_kprobe_optimizer(void)
> +void wait_for_kprobe_optimizer(void)
>  {
>  	mutex_lock(&kprobe_mutex);
>  
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -1535,6 +1535,11 @@ static __init int kprobe_trace_self_test
>  
>  end:
>  	release_all_trace_kprobes();
> +	/*
> +	 * Wait for the optimizer work to finish. Otherwise it might fiddle
> +	 * with probes in already freed __init text.
> +	 */
> +	wait_for_kprobe_optimizer();
>  	if (warn)
>  		pr_cont("NG: Some tests are failed. Please check them.\n");
>  	else


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-05-17  1:38 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-16 13:48 WARNING at arch/x86/kernel/alternative.c:707 text_poke+0x25d/0x270 Steven Rostedt
2017-05-16 15:15 ` Masami Hiramatsu
2017-05-16 15:30   ` Steven Rostedt
2017-05-16 16:27     ` Masami Hiramatsu
2017-05-16 21:42   ` Thomas Gleixner
2017-05-17  1:38     ` Masami Hiramatsu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.