linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* crash in perf_event_read
@ 2017-01-31  7:04 Alexei Starovoitov
  2017-01-31 10:27 ` Peter Zijlstra
  0 siblings, 1 reply; 5+ messages in thread
From: Alexei Starovoitov @ 2017-01-31  7:04 UTC (permalink / raw)
  To: Peter Zijlstra, linux-kernel

Hi Peter,

rarely I'm seeing the following crash:
[40196.164255] BUG: unable to handle kernel paging request at 000000000000a11a
[40196.179636] IP: perf_event_read+0xd3/0x1a0
[40196.188669] PGD 82e93a067
[40196.188670] PUD 7e1ddf067
[40196.194629] PMD 0
[40196.200589]
[40196.208284] Oops: 0000 [#1] SMP
[40196.208285] Modules linked in:
[40196.208299] CPU: 24 PID: 4423 Comm: dynoKernelMon Not tainted 4.10.0-rc5-01189-gc6e0ad0ee5b0 #599
[40196.208300] Hardware name: Quanta Mono Lake-M.2 SATA 20F20BU0270/Mono Lake-M.2 SATA, BIOS F20_3A12 10/24/2016
[40196.208301] task: ffff8807e3b65580 task.stack: ffffc90009748000
[40196.208302] RIP: 0010:perf_event_read+0xd3/0x1a0
[40196.208303] RSP: 0018:ffffc9000974bd48 EFLAGS: 00010202
[40196.208304] RAX: 000000000000a040 RBX: ffff8807b79fd000 RCX: 0000000000000000
[40196.208304] RDX: 0000000000000018 RSI: 0000000000000000 RDI: 00000000ffffffff
[40196.208305] RBP: ffffc9000974bd80 R08: 0000000000000000 R09: 0000000000000000
[40196.208305] R10: ffff8807cf8c7038 R11: 0000000000000000 R12: ffffc9000974bde8
[40196.208306] R13: 0000000000000000 R14: ffff8807b79fd000 R15: 00000000000004e0
[40196.208307] FS:  00007ff1b45ff700(0000) GS:ffff88085f400000(0000) knlGS:0000000000000000
[40196.208307] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[40196.208308] CR2: 000000000000a11a CR3: 0000000850298000 CR4: 00000000003406e0
[40196.208308] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[40196.208309] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[40196.208309] Call Trace:
[40196.208313]  ? __might_sleep+0x4a/0x80
[40196.208316]  perf_event_read_value+0x45/0x130
[40196.208318]  perf_read+0x84/0x2d0
[40196.208322]  __vfs_read+0x28/0x110
[40196.208325]  ? security_file_permission+0x9b/0xc0
[40196.208327]  vfs_read+0xa5/0x170
[40196.208327]  SyS_read+0x46/0xa0
[40196.208329]  do_syscall_64+0x4d/0xb0
[40196.208332]  entry_SYSCALL64_slow_path+0x25/0x25
[40196.208333] RIP: 0033:0x7ff1b6f5716d
[40196.208334] RSP: 002b:00007ff1b45fd080 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
[40196.208335] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff1b6f5716d
[40196.208335] RDX: 0000000000000020 RSI: 00007ff1b45fd090 RDI: 0000000000000054
[40196.208336] RBP: 00007ff1b45fd0e0 R08: 00007ff1b65e45c0 R09: 00007ff1b45fc9b1
[40196.208336] R10: 0000000000000020 R11: 0000000000000293 R12: 0000000000000000
[40196.208337] R13: 00007ff1b5687000 R14: 00000004a817c7fe R15: 00000000000004e0
[40196.208337] Code: 60 02 00 00 74 30 48 63 cf 48 c7 c0 40 a0 00 00 48 8b 34 cd c0 93 d0 81 48 63 ca 48 8b 0c cd c0 93 d0 81 0f b7 8c 08 da 00 00 00 <66> 39 8c 30 da 00 00 00 0f 44 fa 48 8d 55 d0 b9 01 00 00 00 48
[40196.208353] RIP: perf_event_read+0xd3/0x1a0 RSP: ffffc9000974bd48
[40196.208353] CR2: 000000000000a11a

The RIP points to this asm:
   0xffffffff8115fc9b <+155>:   mov    %gs:0x7eeaa486(%rip),%edx        # 0xa128 <cpu_number>
   0xffffffff8115fca2 <+162>:   testb  $0x2,0x68(%rbx)
   0xffffffff8115fca6 <+166>:   mov    0x260(%rdi),%edi
   0xffffffff8115fcac <+172>:   je     0xffffffff8115fcde <perf_event_read+222>
   0xffffffff8115fcae <+174>:   movslq %edi,%rcx
   0xffffffff8115fcb1 <+177>:   mov    $0xa040,%rax
   0xffffffff8115fcb8 <+184>:   mov    -0x7e2f6c40(,%rcx,8),%rsi
   0xffffffff8115fcc0 <+192>:   movslq %edx,%rcx
   0xffffffff8115fcc3 <+195>:   mov    -0x7e2f6c40(,%rcx,8),%rcx
   0xffffffff8115fccb <+203>:   movzwl 0xda(%rax,%rcx,1),%ecx
   0xffffffff8115fcd3 <+211>:   cmp    %cx,0xda(%rax,%rsi,1)

which is this C code:
perf_event_read():
  local_cpu = get_cpu();
  cpu_to_read = find_cpu_to_read(event, local_cpu);
  put_cpu();

find_cpu_to_read():
  event_pkg =  topology_physical_package_id(event_cpu);
  local_pkg =  topology_physical_package_id(local_cpu);
  if (event_pkg == local_pkg)

If I read the asm correctly at the time of the crash
event_cpu == RDI == 00000000ffffffff
or in other words event->oncpu == -1
which I think is technically possible here.

Any suggestions how to fix this?
Happy to test any patches, though I don't know how to reproduce reliably.

Thanks

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: crash in perf_event_read
  2017-01-31  7:04 crash in perf_event_read Alexei Starovoitov
@ 2017-01-31 10:27 ` Peter Zijlstra
  2017-01-31 18:50   ` Alexei Starovoitov
                     ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Peter Zijlstra @ 2017-01-31 10:27 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: linux-kernel, Ingo Molnar, eranian, davidcc

On Mon, Jan 30, 2017 at 11:04:08PM -0800, Alexei Starovoitov wrote:
> Hi Peter,
> 
> rarely I'm seeing the following crash:
> [40196.164255] BUG: unable to handle kernel paging request at 000000000000a11a
> [40196.179636] IP: perf_event_read+0xd3/0x1a0
> [40196.188669] PGD 82e93a067
> [40196.188670] PUD 7e1ddf067
> [40196.194629] PMD 0
> [40196.200589]
> [40196.208284] Oops: 0000 [#1] SMP
> [40196.208285] Modules linked in:
> [40196.208299] CPU: 24 PID: 4423 Comm: dynoKernelMon Not tainted 4.10.0-rc5-01189-gc6e0ad0ee5b0 #599
> [40196.208300] Hardware name: Quanta Mono Lake-M.2 SATA 20F20BU0270/Mono Lake-M.2 SATA, BIOS F20_3A12 10/24/2016
> [40196.208301] task: ffff8807e3b65580 task.stack: ffffc90009748000
> [40196.208302] RIP: 0010:perf_event_read+0xd3/0x1a0
> [40196.208303] RSP: 0018:ffffc9000974bd48 EFLAGS: 00010202
> [40196.208304] RAX: 000000000000a040 RBX: ffff8807b79fd000 RCX: 0000000000000000
> [40196.208304] RDX: 0000000000000018 RSI: 0000000000000000 RDI: 00000000ffffffff
> [40196.208305] RBP: ffffc9000974bd80 R08: 0000000000000000 R09: 0000000000000000
> [40196.208305] R10: ffff8807cf8c7038 R11: 0000000000000000 R12: ffffc9000974bde8
> [40196.208306] R13: 0000000000000000 R14: ffff8807b79fd000 R15: 00000000000004e0
> [40196.208307] FS:  00007ff1b45ff700(0000) GS:ffff88085f400000(0000) knlGS:0000000000000000
> [40196.208307] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [40196.208308] CR2: 000000000000a11a CR3: 0000000850298000 CR4: 00000000003406e0
> [40196.208308] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [40196.208309] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [40196.208309] Call Trace:
> [40196.208313]  ? __might_sleep+0x4a/0x80
> [40196.208316]  perf_event_read_value+0x45/0x130
> [40196.208318]  perf_read+0x84/0x2d0
> [40196.208322]  __vfs_read+0x28/0x110
> [40196.208325]  ? security_file_permission+0x9b/0xc0
> [40196.208327]  vfs_read+0xa5/0x170
> [40196.208327]  SyS_read+0x46/0xa0
> [40196.208329]  do_syscall_64+0x4d/0xb0
> [40196.208332]  entry_SYSCALL64_slow_path+0x25/0x25
> [40196.208333] RIP: 0033:0x7ff1b6f5716d
> [40196.208334] RSP: 002b:00007ff1b45fd080 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
> [40196.208335] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff1b6f5716d
> [40196.208335] RDX: 0000000000000020 RSI: 00007ff1b45fd090 RDI: 0000000000000054
> [40196.208336] RBP: 00007ff1b45fd0e0 R08: 00007ff1b65e45c0 R09: 00007ff1b45fc9b1
> [40196.208336] R10: 0000000000000020 R11: 0000000000000293 R12: 0000000000000000
> [40196.208337] R13: 00007ff1b5687000 R14: 00000004a817c7fe R15: 00000000000004e0
> [40196.208337] Code: 60 02 00 00 74 30 48 63 cf 48 c7 c0 40 a0 00 00 48 8b 34 cd c0 93 d0 81 48 63 ca 48 8b 0c cd c0 93 d0 81 0f b7 8c 08 da 00 00 00 <66> 39 8c 30 da 00 00 00 0f 44 fa 48 8d 55 d0 b9 01 00 00 00 48
> [40196.208353] RIP: perf_event_read+0xd3/0x1a0 RSP: ffffc9000974bd48
> [40196.208353] CR2: 000000000000a11a
> 
> The RIP points to this asm:
>    0xffffffff8115fc9b <+155>:   mov    %gs:0x7eeaa486(%rip),%edx        # 0xa128 <cpu_number>
>    0xffffffff8115fca2 <+162>:   testb  $0x2,0x68(%rbx)
>    0xffffffff8115fca6 <+166>:   mov    0x260(%rdi),%edi
>    0xffffffff8115fcac <+172>:   je     0xffffffff8115fcde <perf_event_read+222>
>    0xffffffff8115fcae <+174>:   movslq %edi,%rcx
>    0xffffffff8115fcb1 <+177>:   mov    $0xa040,%rax
>    0xffffffff8115fcb8 <+184>:   mov    -0x7e2f6c40(,%rcx,8),%rsi
>    0xffffffff8115fcc0 <+192>:   movslq %edx,%rcx
>    0xffffffff8115fcc3 <+195>:   mov    -0x7e2f6c40(,%rcx,8),%rcx
>    0xffffffff8115fccb <+203>:   movzwl 0xda(%rax,%rcx,1),%ecx
>    0xffffffff8115fcd3 <+211>:   cmp    %cx,0xda(%rax,%rsi,1)
> 
> which is this C code:
> perf_event_read():
>   local_cpu = get_cpu();
>   cpu_to_read = find_cpu_to_read(event, local_cpu);
>   put_cpu();
> 
> find_cpu_to_read():
>   event_pkg =  topology_physical_package_id(event_cpu);
>   local_pkg =  topology_physical_package_id(local_cpu);
>   if (event_pkg == local_pkg)
> 
> If I read the asm correctly at the time of the crash
> event_cpu == RDI == 00000000ffffffff
> or in other words event->oncpu == -1
> which I think is technically possible here.
> 
> Any suggestions how to fix this?
> Happy to test any patches, though I don't know how to reproduce reliably.

Right you are, that's buggered.

Something like the below (compile tested only) ought to cure things I
think.

---
Subject: perf: Fix crash in perf_event_read()

Alexei had his box explode because doing read() on a package
(rapl/uncore) event that isn't currently scheduled in ends up doing an
out-of-bounds load.

Rework the code to more explicitly deal with event->oncpu being -1.

Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: Stephane Eranian <eranian@google.com>
Fixes: d6a2f9035bfc ("perf/core: Introduce PMU_EV_CAP_READ_ACTIVE_PKG")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/events/core.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 88676ff98c0f..8d479be0c9a9 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3538,14 +3538,15 @@ struct perf_read_data {
 	int ret;
 };
 
-static int find_cpu_to_read(struct perf_event *event, int local_cpu)
+static int __perf_event_read_cpu(struct perf_event *event, int event_cpu)
 {
-	int event_cpu = event->oncpu;
 	u16 local_pkg, event_pkg;
 
 	if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) {
-		event_pkg =  topology_physical_package_id(event_cpu);
-		local_pkg =  topology_physical_package_id(local_cpu);
+		int local_cpu = smp_processor_id();
+
+		event_pkg = topology_physical_package_id(event_cpu);
+		local_pkg = topology_physical_package_id(local_cpu);
 
 		if (event_pkg == local_pkg)
 			return local_cpu;
@@ -3675,7 +3676,7 @@ u64 perf_event_read_local(struct perf_event *event)
 
 static int perf_event_read(struct perf_event *event, bool group)
 {
-	int ret = 0, cpu_to_read, local_cpu;
+	int event_cpu, ret = 0;
 
 	/*
 	 * If event is enabled and currently active on a CPU, update the
@@ -3688,9 +3689,12 @@ static int perf_event_read(struct perf_event *event, bool group)
 			.ret = 0,
 		};
 
-		local_cpu = get_cpu();
-		cpu_to_read = find_cpu_to_read(event, local_cpu);
-		put_cpu();
+		event_cpu = READ_ONCE(event->oncpu);
+		if ((unsigned)event_cpu >= nr_cpu_ids)
+			return 0;
+
+		preempt_disable();
+		event_cpu = __perf_event_read_cpu(event, event_cpu);
 
 		/*
 		 * Purposely ignore the smp_call_function_single() return
@@ -3702,7 +3706,8 @@ static int perf_event_read(struct perf_event *event, bool group)
 		 * Therefore, either way, we'll have an up-to-date event count
 		 * after this.
 		 */
-		(void)smp_call_function_single(cpu_to_read, __perf_event_read, &data, 1);
+		(void)smp_call_function_single(event_cpu, __perf_event_read, &data, 1);
+		preempt_enable();
 		ret = data.ret;
 	} else if (event->state == PERF_EVENT_STATE_INACTIVE) {
 		struct perf_event_context *ctx = event->ctx;

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: crash in perf_event_read
  2017-01-31 10:27 ` Peter Zijlstra
@ 2017-01-31 18:50   ` Alexei Starovoitov
  2017-02-02  7:43   ` David Carrillo-Cisneros
  2017-02-10  8:32   ` [tip:perf/urgent] perf/core: Fix crash in perf_event_read() tip-bot for Peter Zijlstra
  2 siblings, 0 replies; 5+ messages in thread
From: Alexei Starovoitov @ 2017-01-31 18:50 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, Ingo Molnar, eranian, davidcc

On Tue, Jan 31, 2017 at 11:27:10AM +0100, Peter Zijlstra wrote:
> Subject: perf: Fix crash in perf_event_read()
> 
> Alexei had his box explode because doing read() on a package
> (rapl/uncore) event that isn't currently scheduled in ends up doing an
> out-of-bounds load.
> 
> Rework the code to more explicitly deal with event->oncpu being -1.
> 
> Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Cc: David Carrillo-Cisneros <davidcc@google.com>
> Cc: Stephane Eranian <eranian@google.com>
> Fixes: d6a2f9035bfc ("perf/core: Introduce PMU_EV_CAP_READ_ACTIVE_PKG")
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Thanks for the quick fix!
Everything is clean after several hours.
Tested-by: Alexei Starovoitov <ast@kernel.org>

>  		/*
>  		 * Purposely ignore the smp_call_function_single() return
> @@ -3702,7 +3706,8 @@ static int perf_event_read(struct perf_event *event, bool group)
>  		 * Therefore, either way, we'll have an up-to-date event count
>  		 * after this.
>  		 */

in that comment above (not shown in this diff) there is a line:
"If event->oncpu isn't a valid CPU it means..."
it doesn't match the code due to merge conflict resolution in 2cc538412a.
Probably makes sense to fix it s/event->oncpu/event_cpu/ as part of this diff.

> -		(void)smp_call_function_single(cpu_to_read, __perf_event_read, &data, 1);
> +		(void)smp_call_function_single(event_cpu, __perf_event_read, &data, 1);
> +		preempt_enable();
>  		ret = data.ret;
>  	} else if (event->state == PERF_EVENT_STATE_INACTIVE) {
>  		struct perf_event_context *ctx = event->ctx;

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: crash in perf_event_read
  2017-01-31 10:27 ` Peter Zijlstra
  2017-01-31 18:50   ` Alexei Starovoitov
@ 2017-02-02  7:43   ` David Carrillo-Cisneros
  2017-02-10  8:32   ` [tip:perf/urgent] perf/core: Fix crash in perf_event_read() tip-bot for Peter Zijlstra
  2 siblings, 0 replies; 5+ messages in thread
From: David Carrillo-Cisneros @ 2017-02-02  7:43 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Alexei Starovoitov, linux-kernel, Ingo Molnar, Stephane Eranian

On Tue, Jan 31, 2017 at 2:27 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Mon, Jan 30, 2017 at 11:04:08PM -0800, Alexei Starovoitov wrote:
>> Hi Peter,
>>
>> rarely I'm seeing the following crash:
>> [40196.164255] BUG: unable to handle kernel paging request at 000000000000a11a
>> [40196.179636] IP: perf_event_read+0xd3/0x1a0
>> [40196.188669] PGD 82e93a067
>> [40196.188670] PUD 7e1ddf067
>> [40196.194629] PMD 0
>> [40196.200589]
>> [40196.208284] Oops: 0000 [#1] SMP
>> [40196.208285] Modules linked in:
>> [40196.208299] CPU: 24 PID: 4423 Comm: dynoKernelMon Not tainted 4.10.0-rc5-01189-gc6e0ad0ee5b0 #599
>> [40196.208300] Hardware name: Quanta Mono Lake-M.2 SATA 20F20BU0270/Mono Lake-M.2 SATA, BIOS F20_3A12 10/24/2016
>> [40196.208301] task: ffff8807e3b65580 task.stack: ffffc90009748000
>> [40196.208302] RIP: 0010:perf_event_read+0xd3/0x1a0
>> [40196.208303] RSP: 0018:ffffc9000974bd48 EFLAGS: 00010202
>> [40196.208304] RAX: 000000000000a040 RBX: ffff8807b79fd000 RCX: 0000000000000000
>> [40196.208304] RDX: 0000000000000018 RSI: 0000000000000000 RDI: 00000000ffffffff
>> [40196.208305] RBP: ffffc9000974bd80 R08: 0000000000000000 R09: 0000000000000000
>> [40196.208305] R10: ffff8807cf8c7038 R11: 0000000000000000 R12: ffffc9000974bde8
>> [40196.208306] R13: 0000000000000000 R14: ffff8807b79fd000 R15: 00000000000004e0
>> [40196.208307] FS:  00007ff1b45ff700(0000) GS:ffff88085f400000(0000) knlGS:0000000000000000
>> [40196.208307] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [40196.208308] CR2: 000000000000a11a CR3: 0000000850298000 CR4: 00000000003406e0
>> [40196.208308] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [40196.208309] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [40196.208309] Call Trace:
>> [40196.208313]  ? __might_sleep+0x4a/0x80
>> [40196.208316]  perf_event_read_value+0x45/0x130
>> [40196.208318]  perf_read+0x84/0x2d0
>> [40196.208322]  __vfs_read+0x28/0x110
>> [40196.208325]  ? security_file_permission+0x9b/0xc0
>> [40196.208327]  vfs_read+0xa5/0x170
>> [40196.208327]  SyS_read+0x46/0xa0
>> [40196.208329]  do_syscall_64+0x4d/0xb0
>> [40196.208332]  entry_SYSCALL64_slow_path+0x25/0x25
>> [40196.208333] RIP: 0033:0x7ff1b6f5716d
>> [40196.208334] RSP: 002b:00007ff1b45fd080 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
>> [40196.208335] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff1b6f5716d
>> [40196.208335] RDX: 0000000000000020 RSI: 00007ff1b45fd090 RDI: 0000000000000054
>> [40196.208336] RBP: 00007ff1b45fd0e0 R08: 00007ff1b65e45c0 R09: 00007ff1b45fc9b1
>> [40196.208336] R10: 0000000000000020 R11: 0000000000000293 R12: 0000000000000000
>> [40196.208337] R13: 00007ff1b5687000 R14: 00000004a817c7fe R15: 00000000000004e0
>> [40196.208337] Code: 60 02 00 00 74 30 48 63 cf 48 c7 c0 40 a0 00 00 48 8b 34 cd c0 93 d0 81 48 63 ca 48 8b 0c cd c0 93 d0 81 0f b7 8c 08 da 00 00 00 <66> 39 8c 30 da 00 00 00 0f 44 fa 48 8d 55 d0 b9 01 00 00 00 48
>> [40196.208353] RIP: perf_event_read+0xd3/0x1a0 RSP: ffffc9000974bd48
>> [40196.208353] CR2: 000000000000a11a
>>
>> The RIP points to this asm:
>>    0xffffffff8115fc9b <+155>:   mov    %gs:0x7eeaa486(%rip),%edx        # 0xa128 <cpu_number>
>>    0xffffffff8115fca2 <+162>:   testb  $0x2,0x68(%rbx)
>>    0xffffffff8115fca6 <+166>:   mov    0x260(%rdi),%edi
>>    0xffffffff8115fcac <+172>:   je     0xffffffff8115fcde <perf_event_read+222>
>>    0xffffffff8115fcae <+174>:   movslq %edi,%rcx
>>    0xffffffff8115fcb1 <+177>:   mov    $0xa040,%rax
>>    0xffffffff8115fcb8 <+184>:   mov    -0x7e2f6c40(,%rcx,8),%rsi
>>    0xffffffff8115fcc0 <+192>:   movslq %edx,%rcx
>>    0xffffffff8115fcc3 <+195>:   mov    -0x7e2f6c40(,%rcx,8),%rcx
>>    0xffffffff8115fccb <+203>:   movzwl 0xda(%rax,%rcx,1),%ecx
>>    0xffffffff8115fcd3 <+211>:   cmp    %cx,0xda(%rax,%rsi,1)
>>
>> which is this C code:
>> perf_event_read():
>>   local_cpu = get_cpu();
>>   cpu_to_read = find_cpu_to_read(event, local_cpu);
>>   put_cpu();
>>
>> find_cpu_to_read():
>>   event_pkg =  topology_physical_package_id(event_cpu);
>>   local_pkg =  topology_physical_package_id(local_cpu);
>>   if (event_pkg == local_pkg)
>>
>> If I read the asm correctly at the time of the crash
>> event_cpu == RDI == 00000000ffffffff
>> or in other words event->oncpu == -1
>> which I think is technically possible here.
>>
>> Any suggestions how to fix this?
>> Happy to test any patches, though I don't know how to reproduce reliably.
>
> Right you are, that's buggered.
>
> Something like the below (compile tested only) ought to cure things I
> think.
>
> ---
> Subject: perf: Fix crash in perf_event_read()
>
> Alexei had his box explode because doing read() on a package
> (rapl/uncore) event that isn't currently scheduled in ends up doing an
> out-of-bounds load.
>
> Rework the code to more explicitly deal with event->oncpu being -1.
>
Tested-by: David Carrillo-Cisneros <davidcc@google.com>
> Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Cc: David Carrillo-Cisneros <davidcc@google.com>
> Cc: Stephane Eranian <eranian@google.com>
> Fixes: d6a2f9035bfc ("perf/core: Introduce PMU_EV_CAP_READ_ACTIVE_PKG")
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  kernel/events/core.c | 23 ++++++++++++++---------
>  1 file changed, 14 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 88676ff98c0f..8d479be0c9a9 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -3538,14 +3538,15 @@ struct perf_read_data {
>         int ret;
>  };
>
> -static int find_cpu_to_read(struct perf_event *event, int local_cpu)
> +static int __perf_event_read_cpu(struct perf_event *event, int event_cpu)
>  {
> -       int event_cpu = event->oncpu;
>         u16 local_pkg, event_pkg;
>
>         if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) {
> -               event_pkg =  topology_physical_package_id(event_cpu);
> -               local_pkg =  topology_physical_package_id(local_cpu);
> +               int local_cpu = smp_processor_id();
> +
> +               event_pkg = topology_physical_package_id(event_cpu);
> +               local_pkg = topology_physical_package_id(local_cpu);
>
>                 if (event_pkg == local_pkg)
>                         return local_cpu;
> @@ -3675,7 +3676,7 @@ u64 perf_event_read_local(struct perf_event *event)
>
>  static int perf_event_read(struct perf_event *event, bool group)
>  {
> -       int ret = 0, cpu_to_read, local_cpu;
> +       int event_cpu, ret = 0;
>
>         /*
>          * If event is enabled and currently active on a CPU, update the
> @@ -3688,9 +3689,12 @@ static int perf_event_read(struct perf_event *event, bool group)
>                         .ret = 0,
>                 };
>
> -               local_cpu = get_cpu();
> -               cpu_to_read = find_cpu_to_read(event, local_cpu);
> -               put_cpu();
> +               event_cpu = READ_ONCE(event->oncpu);
> +               if ((unsigned)event_cpu >= nr_cpu_ids)
> +                       return 0;

Yeah, I missed that one. Thanks for fixing it.

> +
> +               preempt_disable();
> +               event_cpu = __perf_event_read_cpu(event, event_cpu);
>
>                 /*
>                  * Purposely ignore the smp_call_function_single() return
> @@ -3702,7 +3706,8 @@ static int perf_event_read(struct perf_event *event, bool group)
>                  * Therefore, either way, we'll have an up-to-date event count
>                  * after this.
>                  */
> -               (void)smp_call_function_single(cpu_to_read, __perf_event_read, &data, 1);
> +               (void)smp_call_function_single(event_cpu, __perf_event_read, &data, 1);
> +               preempt_enable();
>                 ret = data.ret;
>         } else if (event->state == PERF_EVENT_STATE_INACTIVE) {
>                 struct perf_event_context *ctx = event->ctx;

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [tip:perf/urgent] perf/core: Fix crash in perf_event_read()
  2017-01-31 10:27 ` Peter Zijlstra
  2017-01-31 18:50   ` Alexei Starovoitov
  2017-02-02  7:43   ` David Carrillo-Cisneros
@ 2017-02-10  8:32   ` tip-bot for Peter Zijlstra
  2 siblings, 0 replies; 5+ messages in thread
From: tip-bot for Peter Zijlstra @ 2017-02-10  8:32 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: alexei.starovoitov, peterz, linux-kernel, torvalds, davidcc,
	mingo, hpa, tglx, ast

Commit-ID:  451d24d1e5f40bad000fa9abe36ddb16fc9928cb
Gitweb:     http://git.kernel.org/tip/451d24d1e5f40bad000fa9abe36ddb16fc9928cb
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Tue, 31 Jan 2017 11:27:10 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 10 Feb 2017 09:04:50 +0100

perf/core: Fix crash in perf_event_read()

Alexei had his box explode because doing read() on a package
(rapl/uncore) event that isn't currently scheduled in ends up doing an
out-of-bounds load.

Rework the code to more explicitly deal with event->oncpu being -1.

Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: David Carrillo-Cisneros <davidcc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: eranian@google.com
Fixes: d6a2f9035bfc ("perf/core: Introduce PMU_EV_CAP_READ_ACTIVE_PKG")
Link: http://lkml.kernel.org/r/20170131102710.GL6515@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/events/core.c | 25 +++++++++++++++----------
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index e5aaa80..e235bb9 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3487,14 +3487,15 @@ struct perf_read_data {
 	int ret;
 };
 
-static int find_cpu_to_read(struct perf_event *event, int local_cpu)
+static int __perf_event_read_cpu(struct perf_event *event, int event_cpu)
 {
-	int event_cpu = event->oncpu;
 	u16 local_pkg, event_pkg;
 
 	if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) {
-		event_pkg =  topology_physical_package_id(event_cpu);
-		local_pkg =  topology_physical_package_id(local_cpu);
+		int local_cpu = smp_processor_id();
+
+		event_pkg = topology_physical_package_id(event_cpu);
+		local_pkg = topology_physical_package_id(local_cpu);
 
 		if (event_pkg == local_pkg)
 			return local_cpu;
@@ -3624,7 +3625,7 @@ u64 perf_event_read_local(struct perf_event *event)
 
 static int perf_event_read(struct perf_event *event, bool group)
 {
-	int ret = 0, cpu_to_read, local_cpu;
+	int event_cpu, ret = 0;
 
 	/*
 	 * If event is enabled and currently active on a CPU, update the
@@ -3637,21 +3638,25 @@ static int perf_event_read(struct perf_event *event, bool group)
 			.ret = 0,
 		};
 
-		local_cpu = get_cpu();
-		cpu_to_read = find_cpu_to_read(event, local_cpu);
-		put_cpu();
+		event_cpu = READ_ONCE(event->oncpu);
+		if ((unsigned)event_cpu >= nr_cpu_ids)
+			return 0;
+
+		preempt_disable();
+		event_cpu = __perf_event_read_cpu(event, event_cpu);
 
 		/*
 		 * Purposely ignore the smp_call_function_single() return
 		 * value.
 		 *
-		 * If event->oncpu isn't a valid CPU it means the event got
+		 * If event_cpu isn't a valid CPU it means the event got
 		 * scheduled out and that will have updated the event count.
 		 *
 		 * Therefore, either way, we'll have an up-to-date event count
 		 * after this.
 		 */
-		(void)smp_call_function_single(cpu_to_read, __perf_event_read, &data, 1);
+		(void)smp_call_function_single(event_cpu, __perf_event_read, &data, 1);
+		preempt_enable();
 		ret = data.ret;
 	} else if (event->state == PERF_EVENT_STATE_INACTIVE) {
 		struct perf_event_context *ctx = event->ctx;

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-02-10  8:35 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-31  7:04 crash in perf_event_read Alexei Starovoitov
2017-01-31 10:27 ` Peter Zijlstra
2017-01-31 18:50   ` Alexei Starovoitov
2017-02-02  7:43   ` David Carrillo-Cisneros
2017-02-10  8:32   ` [tip:perf/urgent] perf/core: Fix crash in perf_event_read() tip-bot for Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).