linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] kprobes, x86/alternatives: use text_mutex to protect smp_alt_modules
@ 2017-10-27  9:34 Zhou Chengming
  2017-10-27 11:15 ` Borislav Petkov
  2017-10-30  8:03 ` Masami Hiramatsu
  0 siblings, 2 replies; 13+ messages in thread
From: Zhou Chengming @ 2017-10-27  9:34 UTC (permalink / raw)
  To: mhiramat, ananth, anil.s.keshavamurthy, davem, hpa, tglx, bp,
	peterz, jkosina, rostedt, mjurczyk
  Cc: x86, linux-kernel, zhouchengming1

Fixes: 2cfa197 "ftrace/alternatives: Introducing *_text_reserved
functions"

We use alternatives_text_reserved() to check if the address is in
the fixed pieces of alternative reserved, but the problem is that
we don't hold the smp_alt mutex when call this function. So the list
traversal may encounter a deleted list_head if another path is doing
alternatives_smp_module_del().

One solution is that we can hold smp_alt mutex before call this
function, but the difficult point is that the callers of this
functions, arch_prepare_kprobe() and arch_prepare_optimized_kprobe(),
are called inside the text_mutex. So we must hold smp_alt mutex
before we go into these arch dependent code. But we can't now,
the smp_alt mutex is the arch dependent part, only x86 has it.
Maybe we can export another arch dependent callback to solve this.

But there is a simpler way to handle this problem. We can reuse the
text_mutex to protect smp_alt_modules instead of using another mutex.
And all the arch dependent checks of kprobes are inside the text_mutex,
so it's safe now.

Signed-off-by: Zhou Chengming <zhouchengming1@huawei.com>
---
 arch/x86/kernel/alternative.c | 24 +++++++++++-------------
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 3344d33..55abbaa 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -442,7 +442,6 @@ static void alternatives_smp_lock(const s32 *start, const s32 *end,
 {
 	const s32 *poff;
 
-	mutex_lock(&text_mutex);
 	for (poff = start; poff < end; poff++) {
 		u8 *ptr = (u8 *)poff + *poff;
 
@@ -452,7 +451,6 @@ static void alternatives_smp_lock(const s32 *start, const s32 *end,
 		if (*ptr == 0x3e)
 			text_poke(ptr, ((unsigned char []){0xf0}), 1);
 	}
-	mutex_unlock(&text_mutex);
 }
 
 static void alternatives_smp_unlock(const s32 *start, const s32 *end,
@@ -460,7 +458,6 @@ static void alternatives_smp_unlock(const s32 *start, const s32 *end,
 {
 	const s32 *poff;
 
-	mutex_lock(&text_mutex);
 	for (poff = start; poff < end; poff++) {
 		u8 *ptr = (u8 *)poff + *poff;
 
@@ -470,7 +467,6 @@ static void alternatives_smp_unlock(const s32 *start, const s32 *end,
 		if (*ptr == 0xf0)
 			text_poke(ptr, ((unsigned char []){0x3E}), 1);
 	}
-	mutex_unlock(&text_mutex);
 }
 
 struct smp_alt_module {
@@ -489,8 +485,7 @@ struct smp_alt_module {
 	struct list_head next;
 };
 static LIST_HEAD(smp_alt_modules);
-static DEFINE_MUTEX(smp_alt);
-static bool uniproc_patched = false;	/* protected by smp_alt */
+static bool uniproc_patched = false;	/* protected by text_mutex */
 
 void __init_or_module alternatives_smp_module_add(struct module *mod,
 						  char *name,
@@ -499,7 +494,7 @@ void __init_or_module alternatives_smp_module_add(struct module *mod,
 {
 	struct smp_alt_module *smp;
 
-	mutex_lock(&smp_alt);
+	mutex_lock(&text_mutex);
 	if (!uniproc_patched)
 		goto unlock;
 
@@ -526,14 +521,14 @@ void __init_or_module alternatives_smp_module_add(struct module *mod,
 smp_unlock:
 	alternatives_smp_unlock(locks, locks_end, text, text_end);
 unlock:
-	mutex_unlock(&smp_alt);
+	mutex_unlock(&text_mutex);
 }
 
 void __init_or_module alternatives_smp_module_del(struct module *mod)
 {
 	struct smp_alt_module *item;
 
-	mutex_lock(&smp_alt);
+	mutex_lock(&text_mutex);
 	list_for_each_entry(item, &smp_alt_modules, next) {
 		if (mod != item->mod)
 			continue;
@@ -541,7 +536,7 @@ void __init_or_module alternatives_smp_module_del(struct module *mod)
 		kfree(item);
 		break;
 	}
-	mutex_unlock(&smp_alt);
+	mutex_unlock(&text_mutex);
 }
 
 void alternatives_enable_smp(void)
@@ -551,7 +546,7 @@ void alternatives_enable_smp(void)
 	/* Why bother if there are no other CPUs? */
 	BUG_ON(num_possible_cpus() == 1);
 
-	mutex_lock(&smp_alt);
+	mutex_lock(&text_mutex);
 
 	if (uniproc_patched) {
 		pr_info("switching to SMP code\n");
@@ -563,10 +558,13 @@ void alternatives_enable_smp(void)
 					      mod->text, mod->text_end);
 		uniproc_patched = false;
 	}
-	mutex_unlock(&smp_alt);
+	mutex_unlock(&text_mutex);
 }
 
-/* Return 1 if the address range is reserved for smp-alternatives */
+/*
+ * Return 1 if the address range is reserved for smp-alternatives.
+ * Must hold text_mutex.
+ */
 int alternatives_text_reserved(void *start, void *end)
 {
 	struct smp_alt_module *mod;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] kprobes, x86/alternatives: use text_mutex to protect smp_alt_modules
  2017-10-27  9:34 [PATCH] kprobes, x86/alternatives: use text_mutex to protect smp_alt_modules Zhou Chengming
@ 2017-10-27 11:15 ` Borislav Petkov
  2017-10-27 11:42   ` zhouchengming
  2017-10-30  8:03 ` Masami Hiramatsu
  1 sibling, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2017-10-27 11:15 UTC (permalink / raw)
  To: Zhou Chengming
  Cc: mhiramat, ananth, anil.s.keshavamurthy, davem, hpa, tglx, peterz,
	jkosina, rostedt, mjurczyk, x86, linux-kernel

On Fri, Oct 27, 2017 at 05:34:44PM +0800, Zhou Chengming wrote:
> Fixes: 2cfa197 "ftrace/alternatives: Introducing *_text_reserved
> functions"
> 
> We use alternatives_text_reserved() to check if the address is in
> the fixed pieces of alternative reserved, but the problem is that
> we don't hold the smp_alt mutex when call this function. So the list
> traversal may encounter a deleted list_head if another path is doing
> alternatives_smp_module_del().

Is this something you've triggered on a real machine or is this just
from code staring?

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] kprobes, x86/alternatives: use text_mutex to protect smp_alt_modules
  2017-10-27 11:15 ` Borislav Petkov
@ 2017-10-27 11:42   ` zhouchengming
  2017-10-27 12:33     ` Borislav Petkov
  0 siblings, 1 reply; 13+ messages in thread
From: zhouchengming @ 2017-10-27 11:42 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: mhiramat, ananth, anil.s.keshavamurthy, davem, hpa, tglx, peterz,
	jkosina, rostedt, mjurczyk, x86, linux-kernel

On 2017/10/27 19:15, Borislav Petkov wrote:
> On Fri, Oct 27, 2017 at 05:34:44PM +0800, Zhou Chengming wrote:
>> Fixes: 2cfa197 "ftrace/alternatives: Introducing *_text_reserved
>> functions"
>>
>> We use alternatives_text_reserved() to check if the address is in
>> the fixed pieces of alternative reserved, but the problem is that
>> we don't hold the smp_alt mutex when call this function. So the list
>> traversal may encounter a deleted list_head if another path is doing
>> alternatives_smp_module_del().
> Is this something you've triggered on a real machine or is this just
> from code staring?
>

Hi, thanks for your reply.
This is a real bug happened on one of our machines, below is the calltrace.
We can see the trigger is at alternatives_text_reserved+0x20/0x80, and
encounter a deleted (poisoned) list_head.

[   14.016190] general protection fault: 0000 [#1] SMP
[   14.016988] CPU 0
[   14.017287] Modules linked in: mlx4_ib(O+) mlx4_en(O+) ib_sa(O) ib_mad(O) ib_core(O) ib_addr(O) ipv6 mlx4_core(O) compat(O) igb(O) rtc_cmos dca button ata_piix ahci libahci libata ext3 jbd mbcache usbhid hid uhci_hcd ehci_hcd processor thermal_sys hwmon usbcore usb_common sd_mod crc_t10dif virtio_console(O) virtio_pci(O) virtio_net(O) kvm_ivshmem(O) virtio_scsi(O) scsi_mod virtio_blk(O) virtio(O) virtio_ring(O) pv_channel(O)
[   14.020005]
[   14.020005] Pid: 1979, comm: modprobe Tainted: G           O 3.4.24.19-0.11-default #1 QEMU Standard PC (i440FX + PIIX, 1996)
[   14.020005] RIP: 0010:[<ffffffff81007eb0>]  [<ffffffff81007eb0>] alternatives_text_reserved+0x20/0x80
[   14.020005] RSP: 0018:ffff880ea355bcb8  EFLAGS: 00010283
[   14.020005] RAX: dead000000000100 RBX: ffffffffa02af720 RCX: dead0000000000d0
[   14.020005] RDX: ffffffffa02f0588 RSI: ffffffffa02d2fc0 RDI: ffffffffa02d2fc0
[   14.020005] RBP: ffff880ea355bcb8 R08: ffffffffa02f3b68 R09: 00017f4ae12d2fc0
[   14.020005] R10: 00000000000000e8 R11: ffffffffa02bb9d7 R12: 0000000000000000
[   14.020005] R13: ffffffffa02af720 R14: ffffffffa0307140 R15: ffffffffa02af730
[   14.020005] FS:  00007f26c6acc700(0000) GS:ffff880fff200000(0000) knlGS:0000000000000000
[   14.020005] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   14.020005] CR2: 00007fd4adc3b000 CR3: 0000000ea40ea000 CR4: 00000000001407f0
[   14.020005] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   14.020005] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   14.020005] Process modprobe (pid: 1979, threadinfo ffff880ea355a000, task ffff880e9eb9e600)
[   14.020005] Stack:
[   14.020005]  ffff880ea355bcd8 ffffffff8145b148 ffffffffa02af720 ffffffffa02af720
[   14.020005]  ffff880ea355bd18 ffffffff8145ed38 0000000000000000 0000000000000000
[   14.041015]  ffff880ea355bd90 ffffffffa02af720 0000000000000001 ffff880ea355bd90
[   14.041015] Call Trace:
[   14.041015]  [<ffffffff8145b148>] arch_prepare_kprobe+0x18/0x80
[   14.042982]  [<ffffffff8145ed38>] register_kprobe+0x338/0x4c0
[   14.042982]  [<ffffffff8145f658>] register_jprobes+0x98/0xc0
[   14.042982]  [<ffffffff8145f69a>] register_jprobe+0x1a/0x20
[   14.042982]  [<ffffffffa02a5f5d>] mlx4_stats_sysfs_create+0x2d/0x150 [mlx4_en]
[   14.042982]  [<ffffffffa02a4087>] mlx4_en_init_netdev+0xc77/0xe50 [mlx4_en]
[   14.042982]  [<ffffffffa029263a>] mlx4_en_add+0x44a/0x550 [mlx4_en]
[   14.042982]  [<ffffffffa02d2d1c>] mlx4_add_device+0x4c/0xe0 [mlx4_core]
[   14.042982]  [<ffffffffa02b6000>] ? 0xffffffffa02b5fff
[   14.042982]  [<ffffffffa02d2e23>] mlx4_register_interface+0x73/0xb0 [mlx4_core]
[   14.042982]  [<ffffffffa02b6000>] ? 0xffffffffa02b5fff
[   14.042982]  [<ffffffffa02b6031>] __init_backport+0x31/0x1000 [mlx4_en]
[   14.042982]  [<ffffffff810002b2>] do_one_initcall+0x122/0x180
[   14.042982]  [<ffffffff8109ed2d>] sys_init_module+0xbd/0x220
[   14.042982]  [<ffffffff81460c99>] system_call_fastpath+0x16/0x1b
[   14.042982] Code: 66 66 2e 0f 1f 84 00 00 00 00 00 48 8b 05 49 e9 80 00 55 48 89 e5 48 3d e0 67 81 81 48 8d 48 d0 74 59 66 0f 1f 84 00 00 00 00 00 <48> 3b 71 20 72 3a 48 3b 79 28 77 34 48 8b 41 10 4c 8b 41 18 4c
[   14.042982] RIP  [<ffffffff81007eb0>] alternatives_text_reserved+0x20/0x80
[   14.042982]  RSP <ffff880ea355bcb8>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] kprobes, x86/alternatives: use text_mutex to protect smp_alt_modules
  2017-10-27 11:42   ` zhouchengming
@ 2017-10-27 12:33     ` Borislav Petkov
  2017-10-27 13:30       ` zhouchengming
  2017-10-27 14:15       ` Peter Zijlstra
  0 siblings, 2 replies; 13+ messages in thread
From: Borislav Petkov @ 2017-10-27 12:33 UTC (permalink / raw)
  To: zhouchengming
  Cc: mhiramat, ananth, anil.s.keshavamurthy, davem, hpa, tglx, peterz,
	jkosina, rostedt, mjurczyk, x86, linux-kernel

On Fri, Oct 27, 2017 at 07:42:45PM +0800, zhouchengming wrote:
> This is a real bug happened on one of our machines, below is the calltrace.
> We can see the trigger is at alternatives_text_reserved+0x20/0x80, and
> encounter a deleted (poisoned) list_head.

Looks like some out-of-tree, old kernel thing. We don't have
mlx4_stats_sysfs_create() upstream and looking at the boot timestamps,
it could be that register_jprobe() is not ready yet.

Looking at the Code, though:

  20:   74 59                   je     0x7b
  22:   66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
  29:   00 00 
  2b:*  48 3b 71 20             cmp    0x20(%rcx),%rsi          <-- trapping instruction
  2f:   72 3a                   jb     0x6b
  31:   48 3b 79 28             cmp    0x28(%rcx),%rdi
  35:   77 34                   ja     0x6b

%rcx is 0xdead0000000000d0 and that is POISON_POINTER_DELTA + 0xd0 so
that looks more like smp_alt_modules is not initialized yet but I could
could very well be wrong because this is an old kernel. So trigger that
with the upstream kernel without out of tree modules.

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] kprobes, x86/alternatives: use text_mutex to protect smp_alt_modules
  2017-10-27 12:33     ` Borislav Petkov
@ 2017-10-27 13:30       ` zhouchengming
  2017-10-28  8:43         ` Masami Hiramatsu
  2017-10-27 14:15       ` Peter Zijlstra
  1 sibling, 1 reply; 13+ messages in thread
From: zhouchengming @ 2017-10-27 13:30 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: mhiramat, ananth, anil.s.keshavamurthy, davem, hpa, tglx, peterz,
	jkosina, rostedt, mjurczyk, x86, linux-kernel

On 2017/10/27 20:33, Borislav Petkov wrote:
> On Fri, Oct 27, 2017 at 07:42:45PM +0800, zhouchengming wrote:
>> This is a real bug happened on one of our machines, below is the calltrace.
>> We can see the trigger is at alternatives_text_reserved+0x20/0x80, and
>> encounter a deleted (poisoned) list_head.
> Looks like some out-of-tree, old kernel thing. We don't have
> mlx4_stats_sysfs_create() upstream and looking at the boot timestamps,
> it could be that register_jprobe() is not ready yet.

Yes, it's an out-of-tree module, loaded when boot kernel. register_kprobe()
maybe not ready yet, but the bug is not caused by it obviously.

>
> Looking at the Code, though:
>
>    20:   74 59                   je     0x7b
>    22:   66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
>    29:   00 00
>    2b:*  48 3b 71 20             cmp    0x20(%rcx),%rsi<-- trapping instruction
>    2f:   72 3a                   jb     0x6b
>    31:   48 3b 79 28             cmp    0x28(%rcx),%rdi
>    35:   77 34                   ja     0x6b
>
> %rcx is 0xdead0000000000d0 and that is POISON_POINTER_DELTA + 0xd0 so
> that looks more like smp_alt_modules is not initialized yet but I could
> could very well be wrong because this is an old kernel. So trigger that
> with the upstream kernel without out of tree modules.

The smp_alt_modules is defined by LIST_HEAD, so it's initialized at start.

A deleted list_head->next = LIST_POISON1 = 0xdead000000000000 + 0x100, then
container_of() to get the struct smp_alt_module: -0x30 = 0xdead0000000000d0

Obviously, it's a deleted list_head, and I have explained clearly how it happen in
the patch comment.

Thanks.

> Thx.
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] kprobes, x86/alternatives: use text_mutex to protect smp_alt_modules
  2017-10-27 12:33     ` Borislav Petkov
  2017-10-27 13:30       ` zhouchengming
@ 2017-10-27 14:15       ` Peter Zijlstra
  2017-10-28  1:26         ` zhouchengming
  1 sibling, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2017-10-27 14:15 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: zhouchengming, mhiramat, ananth, anil.s.keshavamurthy, davem,
	hpa, tglx, jkosina, rostedt, mjurczyk, x86, linux-kernel

On Fri, Oct 27, 2017 at 02:33:48PM +0200, Borislav Petkov wrote:
> On Fri, Oct 27, 2017 at 07:42:45PM +0800, zhouchengming wrote:
> > This is a real bug happened on one of our machines, below is the calltrace.
> > We can see the trigger is at alternatives_text_reserved+0x20/0x80, and
> > encounter a deleted (poisoned) list_head.
> 
> Looks like some out-of-tree, old kernel thing. We don't have
> mlx4_stats_sysfs_create() upstream and looking at the boot timestamps,
> it could be that register_jprobe() is not ready yet.
> 
> Looking at the Code, though:
> 
>   20:   74 59                   je     0x7b
>   22:   66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
>   29:   00 00 
>   2b:*  48 3b 71 20             cmp    0x20(%rcx),%rsi          <-- trapping instruction
>   2f:   72 3a                   jb     0x6b
>   31:   48 3b 79 28             cmp    0x28(%rcx),%rdi
>   35:   77 34                   ja     0x6b
> 
> %rcx is 0xdead0000000000d0 and that is POISON_POINTER_DELTA + 0xd0 so
> that looks more like smp_alt_modules is not initialized yet but I could
> could very well be wrong because this is an old kernel. So trigger that
> with the upstream kernel without out of tree modules.

Not to mention that we're about (or just have) yanked jprobes out of the
kernel entirely.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] kprobes, x86/alternatives: use text_mutex to protect smp_alt_modules
  2017-10-27 14:15       ` Peter Zijlstra
@ 2017-10-28  1:26         ` zhouchengming
  2017-10-28  8:44           ` Masami Hiramatsu
  0 siblings, 1 reply; 13+ messages in thread
From: zhouchengming @ 2017-10-28  1:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Borislav Petkov, mhiramat, ananth, anil.s.keshavamurthy, davem,
	hpa, tglx, jkosina, rostedt, mjurczyk, x86, linux-kernel

On 2017/10/27 22:15, Peter Zijlstra wrote:
> On Fri, Oct 27, 2017 at 02:33:48PM +0200, Borislav Petkov wrote:
>> On Fri, Oct 27, 2017 at 07:42:45PM +0800, zhouchengming wrote:
>>> This is a real bug happened on one of our machines, below is the calltrace.
>>> We can see the trigger is at alternatives_text_reserved+0x20/0x80, and
>>> encounter a deleted (poisoned) list_head.
>> Looks like some out-of-tree, old kernel thing. We don't have
>> mlx4_stats_sysfs_create() upstream and looking at the boot timestamps,
>> it could be that register_jprobe() is not ready yet.
>>
>> Looking at the Code, though:
>>
>>    20:   74 59                   je     0x7b
>>    22:   66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
>>    29:   00 00
>>    2b:*  48 3b 71 20             cmp    0x20(%rcx),%rsi<-- trapping instruction
>>    2f:   72 3a                   jb     0x6b
>>    31:   48 3b 79 28             cmp    0x28(%rcx),%rdi
>>    35:   77 34                   ja     0x6b
>>
>> %rcx is 0xdead0000000000d0 and that is POISON_POINTER_DELTA + 0xd0 so
>> that looks more like smp_alt_modules is not initialized yet but I could
>> could very well be wrong because this is an old kernel. So trigger that
>> with the upstream kernel without out of tree modules.
> Not to mention that we're about (or just have) yanked jprobes out of the
> kernel entirely.

Well... but this is a bug of alternatives_text_reserved(), it traverse the list without holding
the smp_alt mutex. So all users of it, like kprobes, will still have this problem. Maybe I could
think of a way to get rid of the mutex entirely.

Thanks.

> .
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] kprobes, x86/alternatives: use text_mutex to protect smp_alt_modules
  2017-10-27 13:30       ` zhouchengming
@ 2017-10-28  8:43         ` Masami Hiramatsu
  2017-10-28  9:51           ` zhouchengming
  0 siblings, 1 reply; 13+ messages in thread
From: Masami Hiramatsu @ 2017-10-28  8:43 UTC (permalink / raw)
  To: zhouchengming
  Cc: Borislav Petkov, mhiramat, ananth, anil.s.keshavamurthy, davem,
	hpa, tglx, peterz, jkosina, rostedt, mjurczyk, x86, linux-kernel

On Fri, 27 Oct 2017 21:30:24 +0800
zhouchengming <zhouchengming1@huawei.com> wrote:

> On 2017/10/27 20:33, Borislav Petkov wrote:
> > On Fri, Oct 27, 2017 at 07:42:45PM +0800, zhouchengming wrote:
> >> This is a real bug happened on one of our machines, below is the calltrace.
> >> We can see the trigger is at alternatives_text_reserved+0x20/0x80, and
> >> encounter a deleted (poisoned) list_head.
> > Looks like some out-of-tree, old kernel thing. We don't have
> > mlx4_stats_sysfs_create() upstream and looking at the boot timestamps,
> > it could be that register_jprobe() is not ready yet.
> 
> Yes, it's an out-of-tree module, loaded when boot kernel. register_kprobe()
> maybe not ready yet, but the bug is not caused by it obviously.
> 
> >
> > Looking at the Code, though:
> >
> >    20:   74 59                   je     0x7b
> >    22:   66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
> >    29:   00 00
> >    2b:*  48 3b 71 20             cmp    0x20(%rcx),%rsi<-- trapping instruction
> >    2f:   72 3a                   jb     0x6b
> >    31:   48 3b 79 28             cmp    0x28(%rcx),%rdi
> >    35:   77 34                   ja     0x6b
> >
> > %rcx is 0xdead0000000000d0 and that is POISON_POINTER_DELTA + 0xd0 so
> > that looks more like smp_alt_modules is not initialized yet but I could
> > could very well be wrong because this is an old kernel. So trigger that
> > with the upstream kernel without out of tree modules.
> 
> The smp_alt_modules is defined by LIST_HEAD, so it's initialized at start.
> 
> A deleted list_head->next = LIST_POISON1 = 0xdead000000000000 + 0x100, then
> container_of() to get the struct smp_alt_module: -0x30 = 0xdead0000000000d0
> 
> Obviously, it's a deleted list_head, and I have explained clearly how it happen in
> the patch comment.

Ah, I see. It looks alternatives_text_reserved() bug at a glance.
But simply adding smp_alt mutex to alternatives_text_reserved() causes
ABBA deadlock in the kprobe's path.
So your solution is to replace the smp_alt with text_mutex, since 
alternatives_text_reserved is x86 specific function.

Hmm, let me see... I agree that will be a simple way to solve, but
it also means we have 2 resources protected by text_mutex.

Thank you,


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] kprobes, x86/alternatives: use text_mutex to protect smp_alt_modules
  2017-10-28  1:26         ` zhouchengming
@ 2017-10-28  8:44           ` Masami Hiramatsu
  0 siblings, 0 replies; 13+ messages in thread
From: Masami Hiramatsu @ 2017-10-28  8:44 UTC (permalink / raw)
  To: zhouchengming
  Cc: Peter Zijlstra, Borislav Petkov, mhiramat, ananth,
	anil.s.keshavamurthy, davem, hpa, tglx, jkosina, rostedt,
	mjurczyk, x86, linux-kernel

On Sat, 28 Oct 2017 09:26:56 +0800
zhouchengming <zhouchengming1@huawei.com> wrote:

> On 2017/10/27 22:15, Peter Zijlstra wrote:
> > On Fri, Oct 27, 2017 at 02:33:48PM +0200, Borislav Petkov wrote:
> >> On Fri, Oct 27, 2017 at 07:42:45PM +0800, zhouchengming wrote:
> >>> This is a real bug happened on one of our machines, below is the calltrace.
> >>> We can see the trigger is at alternatives_text_reserved+0x20/0x80, and
> >>> encounter a deleted (poisoned) list_head.
> >> Looks like some out-of-tree, old kernel thing. We don't have
> >> mlx4_stats_sysfs_create() upstream and looking at the boot timestamps,
> >> it could be that register_jprobe() is not ready yet.
> >>
> >> Looking at the Code, though:
> >>
> >>    20:   74 59                   je     0x7b
> >>    22:   66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
> >>    29:   00 00
> >>    2b:*  48 3b 71 20             cmp    0x20(%rcx),%rsi<-- trapping instruction
> >>    2f:   72 3a                   jb     0x6b
> >>    31:   48 3b 79 28             cmp    0x28(%rcx),%rdi
> >>    35:   77 34                   ja     0x6b
> >>
> >> %rcx is 0xdead0000000000d0 and that is POISON_POINTER_DELTA + 0xd0 so
> >> that looks more like smp_alt_modules is not initialized yet but I could
> >> could very well be wrong because this is an old kernel. So trigger that
> >> with the upstream kernel without out of tree modules.
> > Not to mention that we're about (or just have) yanked jprobes out of the
> > kernel entirely.
> 
> Well... but this is a bug of alternatives_text_reserved(), it traverse the list without holding
> the smp_alt mutex. So all users of it, like kprobes, will still have this problem. Maybe I could
> think of a way to get rid of the mutex entirely.

Correct. Peter, this is not related to jprobes, register_kprobe has same issue.

Thank you,



-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] kprobes, x86/alternatives: use text_mutex to protect smp_alt_modules
  2017-10-28  8:43         ` Masami Hiramatsu
@ 2017-10-28  9:51           ` zhouchengming
  0 siblings, 0 replies; 13+ messages in thread
From: zhouchengming @ 2017-10-28  9:51 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Borislav Petkov, ananth, anil.s.keshavamurthy, davem, hpa, tglx,
	peterz, jkosina, rostedt, mjurczyk, x86, linux-kernel

On 2017/10/28 16:43, Masami Hiramatsu wrote:
> On Fri, 27 Oct 2017 21:30:24 +0800
> zhouchengming<zhouchengming1@huawei.com>  wrote:
>
>> On 2017/10/27 20:33, Borislav Petkov wrote:
>>> On Fri, Oct 27, 2017 at 07:42:45PM +0800, zhouchengming wrote:
>>>> This is a real bug happened on one of our machines, below is the calltrace.
>>>> We can see the trigger is at alternatives_text_reserved+0x20/0x80, and
>>>> encounter a deleted (poisoned) list_head.
>>> Looks like some out-of-tree, old kernel thing. We don't have
>>> mlx4_stats_sysfs_create() upstream and looking at the boot timestamps,
>>> it could be that register_jprobe() is not ready yet.
>> Yes, it's an out-of-tree module, loaded when boot kernel. register_kprobe()
>> maybe not ready yet, but the bug is not caused by it obviously.
>>
>>> Looking at the Code, though:
>>>
>>>     20:   74 59                   je     0x7b
>>>     22:   66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
>>>     29:   00 00
>>>     2b:*  48 3b 71 20             cmp    0x20(%rcx),%rsi<-- trapping instruction
>>>     2f:   72 3a                   jb     0x6b
>>>     31:   48 3b 79 28             cmp    0x28(%rcx),%rdi
>>>     35:   77 34                   ja     0x6b
>>>
>>> %rcx is 0xdead0000000000d0 and that is POISON_POINTER_DELTA + 0xd0 so
>>> that looks more like smp_alt_modules is not initialized yet but I could
>>> could very well be wrong because this is an old kernel. So trigger that
>>> with the upstream kernel without out of tree modules.
>> The smp_alt_modules is defined by LIST_HEAD, so it's initialized at start.
>>
>> A deleted list_head->next = LIST_POISON1 = 0xdead000000000000 + 0x100, then
>> container_of() to get the struct smp_alt_module: -0x30 = 0xdead0000000000d0
>>
>> Obviously, it's a deleted list_head, and I have explained clearly how it happen in
>> the patch comment.
> Ah, I see. It looks alternatives_text_reserved() bug at a glance.
> But simply adding smp_alt mutex to alternatives_text_reserved() causes
> ABBA deadlock in the kprobe's path.
> So your solution is to replace the smp_alt with text_mutex, since
> alternatives_text_reserved is x86 specific function.
>
> Hmm, let me see... I agree that will be a simple way to solve, but
> it also means we have 2 resources protected by text_mutex.

Yes, the smp_alt mutex must be held outside the text_mutex, this is a simpler way
to solve, because we will need another x86 specific interface if we want to hold
the smp_alt mutex.

But like you said, it's not good to use one text_mutex to protect 2 resources...
I hope there is any better way.

Thanks.

> Thank you,
>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] kprobes, x86/alternatives: use text_mutex to protect smp_alt_modules
  2017-10-27  9:34 [PATCH] kprobes, x86/alternatives: use text_mutex to protect smp_alt_modules Zhou Chengming
  2017-10-27 11:15 ` Borislav Petkov
@ 2017-10-30  8:03 ` Masami Hiramatsu
  2017-10-31 21:59   ` Steven Rostedt
  1 sibling, 1 reply; 13+ messages in thread
From: Masami Hiramatsu @ 2017-10-30  8:03 UTC (permalink / raw)
  To: Zhou Chengming, Ingo Molnar
  Cc: ananth, anil.s.keshavamurthy, davem, hpa, tglx, bp, peterz,
	jkosina, rostedt, mjurczyk, x86, linux-kernel

On Fri, 27 Oct 2017 17:34:44 +0800
Zhou Chengming <zhouchengming1@huawei.com> wrote:

> Fixes: 2cfa197 "ftrace/alternatives: Introducing *_text_reserved
> functions"
> 
> We use alternatives_text_reserved() to check if the address is in
> the fixed pieces of alternative reserved, but the problem is that
> we don't hold the smp_alt mutex when call this function. So the list
> traversal may encounter a deleted list_head if another path is doing
> alternatives_smp_module_del().
> 
> One solution is that we can hold smp_alt mutex before call this
> function, but the difficult point is that the callers of this
> functions, arch_prepare_kprobe() and arch_prepare_optimized_kprobe(),
> are called inside the text_mutex. So we must hold smp_alt mutex
> before we go into these arch dependent code. But we can't now,
> the smp_alt mutex is the arch dependent part, only x86 has it.
> Maybe we can export another arch dependent callback to solve this.
> 
> But there is a simpler way to handle this problem. We can reuse the
> text_mutex to protect smp_alt_modules instead of using another mutex.
> And all the arch dependent checks of kprobes are inside the text_mutex,
> so it's safe now.

OK, I considered other ways but those may introduce unneeded 
complexity. So this simple solution is better to fix this bug.

Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>

Thank you,

> 
> Signed-off-by: Zhou Chengming <zhouchengming1@huawei.com>
> ---
>  arch/x86/kernel/alternative.c | 24 +++++++++++-------------
>  1 file changed, 11 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> index 3344d33..55abbaa 100644
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -442,7 +442,6 @@ static void alternatives_smp_lock(const s32 *start, const s32 *end,
>  {
>  	const s32 *poff;
>  
> -	mutex_lock(&text_mutex);
>  	for (poff = start; poff < end; poff++) {
>  		u8 *ptr = (u8 *)poff + *poff;
>  
> @@ -452,7 +451,6 @@ static void alternatives_smp_lock(const s32 *start, const s32 *end,
>  		if (*ptr == 0x3e)
>  			text_poke(ptr, ((unsigned char []){0xf0}), 1);
>  	}
> -	mutex_unlock(&text_mutex);
>  }
>  
>  static void alternatives_smp_unlock(const s32 *start, const s32 *end,
> @@ -460,7 +458,6 @@ static void alternatives_smp_unlock(const s32 *start, const s32 *end,
>  {
>  	const s32 *poff;
>  
> -	mutex_lock(&text_mutex);
>  	for (poff = start; poff < end; poff++) {
>  		u8 *ptr = (u8 *)poff + *poff;
>  
> @@ -470,7 +467,6 @@ static void alternatives_smp_unlock(const s32 *start, const s32 *end,
>  		if (*ptr == 0xf0)
>  			text_poke(ptr, ((unsigned char []){0x3E}), 1);
>  	}
> -	mutex_unlock(&text_mutex);
>  }
>  
>  struct smp_alt_module {
> @@ -489,8 +485,7 @@ struct smp_alt_module {
>  	struct list_head next;
>  };
>  static LIST_HEAD(smp_alt_modules);
> -static DEFINE_MUTEX(smp_alt);
> -static bool uniproc_patched = false;	/* protected by smp_alt */
> +static bool uniproc_patched = false;	/* protected by text_mutex */
>  
>  void __init_or_module alternatives_smp_module_add(struct module *mod,
>  						  char *name,
> @@ -499,7 +494,7 @@ void __init_or_module alternatives_smp_module_add(struct module *mod,
>  {
>  	struct smp_alt_module *smp;
>  
> -	mutex_lock(&smp_alt);
> +	mutex_lock(&text_mutex);
>  	if (!uniproc_patched)
>  		goto unlock;
>  
> @@ -526,14 +521,14 @@ void __init_or_module alternatives_smp_module_add(struct module *mod,
>  smp_unlock:
>  	alternatives_smp_unlock(locks, locks_end, text, text_end);
>  unlock:
> -	mutex_unlock(&smp_alt);
> +	mutex_unlock(&text_mutex);
>  }
>  
>  void __init_or_module alternatives_smp_module_del(struct module *mod)
>  {
>  	struct smp_alt_module *item;
>  
> -	mutex_lock(&smp_alt);
> +	mutex_lock(&text_mutex);
>  	list_for_each_entry(item, &smp_alt_modules, next) {
>  		if (mod != item->mod)
>  			continue;
> @@ -541,7 +536,7 @@ void __init_or_module alternatives_smp_module_del(struct module *mod)
>  		kfree(item);
>  		break;
>  	}
> -	mutex_unlock(&smp_alt);
> +	mutex_unlock(&text_mutex);
>  }
>  
>  void alternatives_enable_smp(void)
> @@ -551,7 +546,7 @@ void alternatives_enable_smp(void)
>  	/* Why bother if there are no other CPUs? */
>  	BUG_ON(num_possible_cpus() == 1);
>  
> -	mutex_lock(&smp_alt);
> +	mutex_lock(&text_mutex);
>  
>  	if (uniproc_patched) {
>  		pr_info("switching to SMP code\n");
> @@ -563,10 +558,13 @@ void alternatives_enable_smp(void)
>  					      mod->text, mod->text_end);
>  		uniproc_patched = false;
>  	}
> -	mutex_unlock(&smp_alt);
> +	mutex_unlock(&text_mutex);
>  }
>  
> -/* Return 1 if the address range is reserved for smp-alternatives */
> +/*
> + * Return 1 if the address range is reserved for smp-alternatives.
> + * Must hold text_mutex.
> + */
>  int alternatives_text_reserved(void *start, void *end)
>  {
>  	struct smp_alt_module *mod;
> -- 
> 1.8.3.1
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] kprobes, x86/alternatives: use text_mutex to protect smp_alt_modules
  2017-10-30  8:03 ` Masami Hiramatsu
@ 2017-10-31 21:59   ` Steven Rostedt
  2017-11-01  1:48     ` zhouchengming
  0 siblings, 1 reply; 13+ messages in thread
From: Steven Rostedt @ 2017-10-31 21:59 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Zhou Chengming, Ingo Molnar, ananth, anil.s.keshavamurthy, davem,
	hpa, tglx, bp, peterz, jkosina, mjurczyk, x86, linux-kernel

On Mon, 30 Oct 2017 17:03:23 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:

> >  static LIST_HEAD(smp_alt_modules);
> > -static DEFINE_MUTEX(smp_alt);
> > -static bool uniproc_patched = false;	/* protected by smp_alt */
> > +static bool uniproc_patched = false;	/* protected by text_mutex */

We should also add a comment somewhere by the text_mutex that it is
protecting this on x86.

-- Steve

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] kprobes, x86/alternatives: use text_mutex to protect smp_alt_modules
  2017-10-31 21:59   ` Steven Rostedt
@ 2017-11-01  1:48     ` zhouchengming
  0 siblings, 0 replies; 13+ messages in thread
From: zhouchengming @ 2017-11-01  1:48 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Ingo Molnar, ananth, anil.s.keshavamurthy,
	davem, hpa, tglx, bp, peterz, jkosina, mjurczyk, x86,
	linux-kernel

On 2017/11/1 5:59, Steven Rostedt wrote:
> On Mon, 30 Oct 2017 17:03:23 +0900
> Masami Hiramatsu<mhiramat@kernel.org>  wrote:
>
>>>   static LIST_HEAD(smp_alt_modules);
>>> -static DEFINE_MUTEX(smp_alt);
>>> -static bool uniproc_patched = false;	/* protected by smp_alt */
>>> +static bool uniproc_patched = false;	/* protected by text_mutex */
> We should also add a comment somewhere by the text_mutex that it is
> protecting this on x86.

Good, I will send a patch-v2 adding this comment.

Thanks!

> -- Steve
>
> .
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-11-01  1:54 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-27  9:34 [PATCH] kprobes, x86/alternatives: use text_mutex to protect smp_alt_modules Zhou Chengming
2017-10-27 11:15 ` Borislav Petkov
2017-10-27 11:42   ` zhouchengming
2017-10-27 12:33     ` Borislav Petkov
2017-10-27 13:30       ` zhouchengming
2017-10-28  8:43         ` Masami Hiramatsu
2017-10-28  9:51           ` zhouchengming
2017-10-27 14:15       ` Peter Zijlstra
2017-10-28  1:26         ` zhouchengming
2017-10-28  8:44           ` Masami Hiramatsu
2017-10-30  8:03 ` Masami Hiramatsu
2017-10-31 21:59   ` Steven Rostedt
2017-11-01  1:48     ` zhouchengming

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).