linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] perf/core: Install cgroup event via IPI
@ 2020-01-16 17:25 Song Liu
  2020-01-20  8:50 ` Peter Zijlstra
  0 siblings, 1 reply; 3+ messages in thread
From: Song Liu @ 2020-01-16 17:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel-team, Song Liu, Andi Kleen, Peter Zijlstra,
	Alexander Shishkin, Arnaldo Carvalho de Melo, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner

cgroup events in OFF state cannot be installed without IPI, otherwise, it
may trigger the following calltrace with CONFIG_DEBUG_LIST:

[   31.776974] ------------[ cut here ]------------
[   31.777570] list_add double add: new=ffff888ff7cf0db0, prev=ffff888ff7ce82f0, next=ffff888ff7cf0db0.
[   31.778737] WARNING: CPU: 3 PID: 1186 at lib/list_debug.c:31 __list_add_valid+0x67/0x70
[   31.779745] Modules linked in:
[   31.780138] CPU: 3 PID: 1186 Comm: perf Tainted: G        W         5.5.0-rc6+ #3962
[   31.781125] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
[   31.782199] RIP: 0010:__list_add_valid+0x67/0x70
[   31.782774] Code: c1 4c 89 c6 48 c7 c7 f8 cd 57 82 e8 43 a0 a4 ff 0f 0b 31 c0 c3 48 89 f2 4c 89 c1 48 89 fe 48 c7 c7 48 ce 57 82 e8 29 a0 a4 ff <0f> 0b 31 c0 c3 0f 1f 40 00 48 b9 00 01 00 00 00 00 ad de 48 8b 07
[   31.785066] RSP: 0018:ffffc900013ffdb8 EFLAGS: 00010086
[   31.785713] RAX: 0000000000000000 RBX: ffff888d5bb4a000 RCX: 0000000000000007
[   31.786596] RDX: 0000000000000000 RSI: 0000000000000086 RDI: ffff888ff7cd8870
[   31.787471] RBP: ffff888ff7db0c40 R08: 0000000000001b3b R09: 0000000000000067
[   31.788352] R10: ffff888ff7db0c90 R11: ffffc900013ffc65 R12: ffff888ff7cf0c40
[   31.789229] R13: ffff888ff7ce82f0 R14: ffff888ff7cf0db0 R15: ffff888ff7cf0db0
[   31.790115] FS:  00007f14cd1557c0(0000) GS:ffff888ff7cc0000(0000) knlGS:0000000000000000
[   31.791111] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   31.791824] CR2: 00007f675e3ace60 CR3: 0000000d55e76006 CR4: 00000000003606e0
[   31.792703] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   31.793583] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   31.794461] Call Trace:
[   31.794776]  list_add_event+0xe5/0x230
[   31.795247]  perf_install_in_context+0x155/0x1f0
[   31.795819]  ? anon_inode_getfile+0x7f/0xd0
[   31.796342]  __do_sys_perf_event_open+0x323/0xd60
[   31.796921]  do_syscall_64+0x55/0x1c0
[   31.797384]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   31.798008] RIP: 0033:0x7f14ca4a33e9
[   31.798460] Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 6f 9a 2c 00 f7 d8 64 89 01 48
[   31.800729] RSP: 002b:00007ffeaf16e5a8 EFLAGS: 00000246 ORIG_RAX: 000000000000012a
[   31.801655] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f14ca4a33e9
[   31.802536] RDX: 0000000000000006 RSI: 0000000000000003 RDI: 0000000001053fd8
[   31.803412] RBP: 00007ffeaf16e670 R08: 000000000000000c R09: 000000000000000c
[   31.804286] R10: 00000000ffffffff R11: 0000000000000246 R12: 00000000ffffffff
[   31.805160] R13: 0000000000000006 R14: 0000000001053af0 R15: 0000000001053fc0
[   31.806031] ---[ end trace ef48f280582d1897 ]---

To reproduce this, we can simply run:
   perf stat -e cs -a &
   perf stat -e cs -G anycgroup

Fix this by using IPI to install cgroup events.

Fixes: db0503e4f675 ("perf/core: Optimize perf_install_in_event()")
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Song Liu <songliubraving@fb.com>
---
 kernel/events/core.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index a1f8bde19b56..36e8fe27e2a1 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2682,14 +2682,18 @@ perf_install_in_context(struct perf_event_context *ctx,
 	smp_store_release(&event->ctx, ctx);
 
 	/*
-	 * perf_event_attr::disabled events will not run and can be initialized
-	 * without IPI. Except when this is the first event for the context, in
-	 * that case we need the magic of the IPI to set ctx->is_active.
+	 * perf_event_attr::disabled events will not run and can be
+	 * initialized without IPI. Except:
+	 *   1. when this is the first event for the context, in that case
+	 *      we need the magic of the IPI to set ctx->is_active;
+	 *   2. cgroup event in OFF state, because it is installed in the
+	 *      cpuctx.
 	 *
 	 * The IOC_ENABLE that is sure to follow the creation of a disabled
 	 * event will issue the IPI and reprogram the hardware.
 	 */
-	if (__perf_effective_state(event) == PERF_EVENT_STATE_OFF && ctx->nr_events) {
+	if (__perf_effective_state(event) == PERF_EVENT_STATE_OFF &&
+	    !is_cgroup_event(event) && ctx->nr_events) {
 		raw_spin_lock_irq(&ctx->lock);
 		if (ctx->task == TASK_TOMBSTONE) {
 			raw_spin_unlock_irq(&ctx->lock);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] perf/core: Install cgroup event via IPI
  2020-01-16 17:25 [PATCH] perf/core: Install cgroup event via IPI Song Liu
@ 2020-01-20  8:50 ` Peter Zijlstra
  2020-01-21 18:38   ` Song Liu
  0 siblings, 1 reply; 3+ messages in thread
From: Peter Zijlstra @ 2020-01-20  8:50 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, kernel-team, Andi Kleen, Alexander Shishkin,
	Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner

On Thu, Jan 16, 2020 at 09:25:55AM -0800, Song Liu wrote:
> cgroup events in OFF state cannot be installed without IPI, otherwise, it
> may trigger the following calltrace with CONFIG_DEBUG_LIST:
> 
> [   31.776974] ------------[ cut here ]------------
> [   31.777570] list_add double add: new=ffff888ff7cf0db0, prev=ffff888ff7ce82f0, next=ffff888ff7cf0db0.
> [   31.778737] WARNING: CPU: 3 PID: 1186 at lib/list_debug.c:31 __list_add_valid+0x67/0x70
> [   31.779745] Modules linked in:
> [   31.780138] CPU: 3 PID: 1186 Comm: perf Tainted: G        W         5.5.0-rc6+ #3962
> [   31.781125] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
> [   31.782199] RIP: 0010:__list_add_valid+0x67/0x70

> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index a1f8bde19b56..36e8fe27e2a1 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2682,14 +2682,18 @@ perf_install_in_context(struct perf_event_context *ctx,
>  	smp_store_release(&event->ctx, ctx);
>  
>  	/*
> -	 * perf_event_attr::disabled events will not run and can be initialized
> -	 * without IPI. Except when this is the first event for the context, in
> -	 * that case we need the magic of the IPI to set ctx->is_active.
> +	 * perf_event_attr::disabled events will not run and can be
> +	 * initialized without IPI. Except:
> +	 *   1. when this is the first event for the context, in that case
> +	 *      we need the magic of the IPI to set ctx->is_active;
> +	 *   2. cgroup event in OFF state, because it is installed in the
> +	 *      cpuctx.
>  	 *
>  	 * The IOC_ENABLE that is sure to follow the creation of a disabled
>  	 * event will issue the IPI and reprogram the hardware.
>  	 */
> -	if (__perf_effective_state(event) == PERF_EVENT_STATE_OFF && ctx->nr_events) {
> +	if (__perf_effective_state(event) == PERF_EVENT_STATE_OFF &&
> +	    !is_cgroup_event(event) && ctx->nr_events) {
>  		raw_spin_lock_irq(&ctx->lock);
>  		if (ctx->task == TASK_TOMBSTONE) {
>  			raw_spin_unlock_irq(&ctx->lock);

I don't think this is right. Because cgroup events are always per-cpu
events, ctx == &cpuctx->ctx, so the locking should work out just fine.

What does appear to be the problem is that:

  add_event_to_ctx()
    list_update_cgroup_event()
      cpuctx = __get_cpu_context(ctx)

uses this_cpu_ptr() and we're now calling it from the 'wrong' CPU.

But I'm thinking the below should also work just fine, hmm?

---

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2173c23c25b4..2c6134604811 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -951,9 +951,9 @@ list_update_cgroup_event(struct perf_event *event,
 
 	/*
 	 * Because cgroup events are always per-cpu events,
-	 * this will always be called from the right CPU.
+	 * @ctx == &cpuctx->cpu.
 	 */
-	cpuctx = __get_cpu_context(ctx);
+	cpuctx = container_of(ctx, struct perf_cpu_context, ctx);
 
 	/*
 	 * Since setting cpuctx->cgrp is conditional on the current @cgrp

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] perf/core: Install cgroup event via IPI
  2020-01-20  8:50 ` Peter Zijlstra
@ 2020-01-21 18:38   ` Song Liu
  0 siblings, 0 replies; 3+ messages in thread
From: Song Liu @ 2020-01-21 18:38 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Kernel Team, Andi Kleen, Alexander Shishkin,
	Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner



> On Jan 20, 2020, at 12:50 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> 
> On Thu, Jan 16, 2020 at 09:25:55AM -0800, Song Liu wrote:
>> cgroup events in OFF state cannot be installed without IPI, otherwise, it
>> may trigger the following calltrace with CONFIG_DEBUG_LIST:
>> 
>> [   31.776974] ------------[ cut here ]------------
>> [   31.777570] list_add double add: new=ffff888ff7cf0db0, prev=ffff888ff7ce82f0, next=ffff888ff7cf0db0.
>> [   31.778737] WARNING: CPU: 3 PID: 1186 at lib/list_debug.c:31 __list_add_valid+0x67/0x70
>> [   31.779745] Modules linked in:
>> [   31.780138] CPU: 3 PID: 1186 Comm: perf Tainted: G        W         5.5.0-rc6+ #3962
>> [   31.781125] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
>> [   31.782199] RIP: 0010:__list_add_valid+0x67/0x70
> 
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index a1f8bde19b56..36e8fe27e2a1 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -2682,14 +2682,18 @@ perf_install_in_context(struct perf_event_context *ctx,
>> 	smp_store_release(&event->ctx, ctx);
>> 
>> 	/*
>> -	 * perf_event_attr::disabled events will not run and can be initialized
>> -	 * without IPI. Except when this is the first event for the context, in
>> -	 * that case we need the magic of the IPI to set ctx->is_active.
>> +	 * perf_event_attr::disabled events will not run and can be
>> +	 * initialized without IPI. Except:
>> +	 *   1. when this is the first event for the context, in that case
>> +	 *      we need the magic of the IPI to set ctx->is_active;
>> +	 *   2. cgroup event in OFF state, because it is installed in the
>> +	 *      cpuctx.
>> 	 *
>> 	 * The IOC_ENABLE that is sure to follow the creation of a disabled
>> 	 * event will issue the IPI and reprogram the hardware.
>> 	 */
>> -	if (__perf_effective_state(event) == PERF_EVENT_STATE_OFF && ctx->nr_events) {
>> +	if (__perf_effective_state(event) == PERF_EVENT_STATE_OFF &&
>> +	    !is_cgroup_event(event) && ctx->nr_events) {
>> 		raw_spin_lock_irq(&ctx->lock);
>> 		if (ctx->task == TASK_TOMBSTONE) {
>> 			raw_spin_unlock_irq(&ctx->lock);
> 
> I don't think this is right. Because cgroup events are always per-cpu
> events, ctx == &cpuctx->ctx, so the locking should work out just fine.
> 
> What does appear to be the problem is that:
> 
>  add_event_to_ctx()
>    list_update_cgroup_event()
>      cpuctx = __get_cpu_context(ctx)
> 
> uses this_cpu_ptr() and we're now calling it from the 'wrong' CPU.

Yes, this is because add_event_to_ctx() is called from wrong CPU. 

> 
> But I'm thinking the below should also work just fine, hmm?

And yes, this also fixes it. Thanks!

Would you send official PATCH?

Song


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-01-21 18:39 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-16 17:25 [PATCH] perf/core: Install cgroup event via IPI Song Liu
2020-01-20  8:50 ` Peter Zijlstra
2020-01-21 18:38   ` Song Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).