All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] perf: fix GPF in perf_cgroup_switch()
@ 2022-02-04  0:40 Song Liu
  2022-02-04  0:54 ` Rik van Riel
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Song Liu @ 2022-02-04  0:40 UTC (permalink / raw)
  To: linux-kernel; +Cc: kernel-team, peterz, Song Liu, Yonghong Song, Rik van Riel

GPF is observed in perf_cgroup_switch():

[ 2683.232477] general protection fault, probably for non-canonical address 0xdeacffffffffff90: 0000 [#1] SMP
[ 2683.251802] CPU: 30 PID: 0 Comm: swapper/30 Kdump: loaded Tainted: G S
[ 2683.273726] Hardware name: Quanta Twin Lakes MP/Twin Lakes Passive MP,
[ 2683.291129] RIP: 0010:perf_cgroup_switch+0xc0/0x170
[ 2683.300889] Code: 85 ff ff 48 8b 83 00 01 00 00 48 85 c0 74 04 c6 40 08 \
         00 c6 43 08 00 48 8b 83 70 01 00 00 48 8d 98 90 fe ff ff 48 39 c5 \
         74 7d <8b> 83 e4 00 00 00 85 c0 0f 84 89 00 00 00 4c 8b bb 00 01 00 00 48
[ 2683.338455] RSP: 0018:ffffc9000021fdb0 EFLAGS: 00010002
[ 2683.348904] RAX: dead000000000100 RBX: deacffffffffff90 RCX: 000000000000038f
[ 2683.363176] RDX: 0000000000000007 RSI: 0000000000000400 RDI: 0000000000000000
[ 2683.377447] RBP: ffff88903ffa77b0 R08: 0000000300000003 R09: 0000000000000004
[ 2683.391718] R10: 0000000000000003 R11: 0000000000000001 R12: 0000000000000002
[ 2683.405989] R13: 0000000000000000 R14: ffff8881013fdc00 R15: 0000000000000000
[ 2683.420261] FS:  0000000000000000(0000) GS:ffff88903ff80000(0000) knlGS:0000000000000000
[ 2683.436446] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2683.447937] CR2: 00007ffafb956020 CR3: 0000000141cd7005 CR4: 00000000007706e0
[ 2683.462209] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2683.476481] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2683.490752] PKRU: 55555554
[ 2683.496160] Call Trace:
[ 2683.501048]  __perf_event_task_sched_in+0xb3/0x200
[ 2683.510632]  finish_task_switch+0x186/0x270
[ 2683.518999]  __schedule+0x3b1/0x850
[ 2683.525973]  ? cpuidle_enter_state+0xa7/0x340
[ 2683.534687]  ? update_ts_time_stats+0x51/0x70
[ 2683.543399]  schedule_idle+0x1e/0x40
[ 2683.550548]  do_idle+0x148/0x200
[ 2683.557001]  cpu_startup_entry+0x19/0x20
[ 2683.564843]  start_secondary+0x104/0x140
[ 2683.572688]  secondary_startup_64_no_verify+0xb0/0xbb

which indicates list corruption on cgrp_cpuctx_list. This happens on the
following path:

  perf_cgroup_switch: list_for_each_entry(cgrp_cpuctx_list)
      cpu_ctx_sched_in
         ctx_sched_in
            ctx_pinned_sched_in
              merge_sched_in
                  perf_cgroup_event_disable: remove the event from the list

To repro this on Intel CPUs:

  /* occupy all counters with pinned events (watchdog uses another) */
  perf stat -e cycles:D,cycles:D,cycles:D,cycles:D,cycles:D -a &
  /* add a pinned cgroup event */
  perf stat -e cycles:D -G my-cgroup
  /* GPF immediately */

Fix this with list_for_each_entry_safe().

Cc: Yonghong Song <yhs@fb.com>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Song Liu <song@kernel.org>
---
 kernel/events/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index afbf388a5176..46babdf76d8f 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -839,7 +839,7 @@ static DEFINE_PER_CPU(struct list_head, cgrp_cpuctx_list);
  */
 static void perf_cgroup_switch(struct task_struct *task, int mode)
 {
-	struct perf_cpu_context *cpuctx;
+	struct perf_cpu_context *cpuctx, *tmp;
 	struct list_head *list;
 	unsigned long flags;
 
@@ -850,7 +850,7 @@ static void perf_cgroup_switch(struct task_struct *task, int mode)
 	local_irq_save(flags);
 
 	list = this_cpu_ptr(&cgrp_cpuctx_list);
-	list_for_each_entry(cpuctx, list, cgrp_cpuctx_entry) {
+	list_for_each_entry_safe(cpuctx, tmp, list, cgrp_cpuctx_entry) {
 		WARN_ON_ONCE(cpuctx->ctx.nr_cgroups == 0);
 
 		perf_ctx_lock(cpuctx, cpuctx->task_ctx);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] perf: fix GPF in perf_cgroup_switch()
  2022-02-04  0:40 [PATCH] perf: fix GPF in perf_cgroup_switch() Song Liu
@ 2022-02-04  0:54 ` Rik van Riel
  2022-02-06 21:33 ` Peter Zijlstra
  2022-02-07 13:51 ` [tip: perf/urgent] perf: Fix list corruption " tip-bot2 for Song Liu
  2 siblings, 0 replies; 5+ messages in thread
From: Rik van Riel @ 2022-02-04  0:54 UTC (permalink / raw)
  To: Song Liu, linux-kernel; +Cc: kernel-team, peterz, Yonghong Song

[-- Attachment #1: Type: text/plain, Size: 296 bytes --]

On Thu, 2022-02-03 at 16:40 -0800, Song Liu wrote:
> Fix this with list_for_each_entry_safe().
> 
> Cc: Yonghong Song <yhs@fb.com>
> Cc: Rik van Riel <riel@surriel.com>
> Signed-off-by: Song Liu <song@kernel.org>

Reviewed-by: Rik van Riel <riel@surriel.com>

-- 
All Rights Reversed.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] perf: fix GPF in perf_cgroup_switch()
  2022-02-04  0:40 [PATCH] perf: fix GPF in perf_cgroup_switch() Song Liu
  2022-02-04  0:54 ` Rik van Riel
@ 2022-02-06 21:33 ` Peter Zijlstra
  2022-02-06 22:04   ` Song Liu
  2022-02-07 13:51 ` [tip: perf/urgent] perf: Fix list corruption " tip-bot2 for Song Liu
  2 siblings, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2022-02-06 21:33 UTC (permalink / raw)
  To: Song Liu; +Cc: linux-kernel, kernel-team, Yonghong Song, Rik van Riel

On Thu, Feb 03, 2022 at 04:40:57PM -0800, Song Liu wrote:
> GPF is observed in perf_cgroup_switch():
> 
> [ 2683.232477] general protection fault, probably for non-canonical address 0xdeacffffffffff90: 0000 [#1] SMP
> [ 2683.251802] CPU: 30 PID: 0 Comm: swapper/30 Kdump: loaded Tainted: G S
> [ 2683.273726] Hardware name: Quanta Twin Lakes MP/Twin Lakes Passive MP,
> [ 2683.291129] RIP: 0010:perf_cgroup_switch+0xc0/0x170
> [ 2683.300889] Code: 85 ff ff 48 8b 83 00 01 00 00 48 85 c0 74 04 c6 40 08 \
>          00 c6 43 08 00 48 8b 83 70 01 00 00 48 8d 98 90 fe ff ff 48 39 c5 \
>          74 7d <8b> 83 e4 00 00 00 85 c0 0f 84 89 00 00 00 4c 8b bb 00 01 00 00 48
> [ 2683.338455] RSP: 0018:ffffc9000021fdb0 EFLAGS: 00010002
> [ 2683.348904] RAX: dead000000000100 RBX: deacffffffffff90 RCX: 000000000000038f
> [ 2683.363176] RDX: 0000000000000007 RSI: 0000000000000400 RDI: 0000000000000000
> [ 2683.377447] RBP: ffff88903ffa77b0 R08: 0000000300000003 R09: 0000000000000004
> [ 2683.391718] R10: 0000000000000003 R11: 0000000000000001 R12: 0000000000000002
> [ 2683.405989] R13: 0000000000000000 R14: ffff8881013fdc00 R15: 0000000000000000
> [ 2683.420261] FS:  0000000000000000(0000) GS:ffff88903ff80000(0000) knlGS:0000000000000000
> [ 2683.436446] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2683.447937] CR2: 00007ffafb956020 CR3: 0000000141cd7005 CR4: 00000000007706e0
> [ 2683.462209] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 2683.476481] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 2683.490752] PKRU: 55555554
> [ 2683.496160] Call Trace:
> [ 2683.501048]  __perf_event_task_sched_in+0xb3/0x200
> [ 2683.510632]  finish_task_switch+0x186/0x270
> [ 2683.518999]  __schedule+0x3b1/0x850
> [ 2683.525973]  ? cpuidle_enter_state+0xa7/0x340
> [ 2683.534687]  ? update_ts_time_stats+0x51/0x70
> [ 2683.543399]  schedule_idle+0x1e/0x40
> [ 2683.550548]  do_idle+0x148/0x200
> [ 2683.557001]  cpu_startup_entry+0x19/0x20
> [ 2683.564843]  start_secondary+0x104/0x140
> [ 2683.572688]  secondary_startup_64_no_verify+0xb0/0xbb

Please, don't put splats in changelogs like this, simply put the
relevant information in. Which you've already done below, so the above
is completely redundant.

> 
> which indicates list corruption on cgrp_cpuctx_list. This happens on the
> following path:
> 
>   perf_cgroup_switch: list_for_each_entry(cgrp_cpuctx_list)
>       cpu_ctx_sched_in
>          ctx_sched_in
>             ctx_pinned_sched_in
>               merge_sched_in
>                   perf_cgroup_event_disable: remove the event from the list
> 
> To repro this on Intel CPUs:
> 
>   /* occupy all counters with pinned events (watchdog uses another) */
>   perf stat -e cycles:D,cycles:D,cycles:D,cycles:D,cycles:D -a &
>   /* add a pinned cgroup event */
>   perf stat -e cycles:D -G my-cgroup
>   /* GPF immediately */

Luckily this requires root, but please don't hand reproducers like this
to the kiddies :/

> Fix this with list_for_each_entry_safe().

*sigh*

You forgot Fixes, I'm thinking this is due to commit: 058fe1c0440e6.

With all that, I seem to end up with the below.

---
Subject: perf: Fix list corruption in perf_cgroup_switch()
From: Song Liu <song@kernel.org>
Date: Thu, 3 Feb 2022 16:40:57 -0800

From: Song Liu <song@kernel.org>

There's list corruption on cgrp_cpuctx_list. This happens on the
following path:

  perf_cgroup_switch: list_for_each_entry(cgrp_cpuctx_list)
      cpu_ctx_sched_in
         ctx_sched_in
            ctx_pinned_sched_in
              merge_sched_in
                  perf_cgroup_event_disable: remove the event from the list

Use list_for_each_entry_safe() to allow removing an entry during
iteration.

Fixes: 058fe1c0440e ("perf/core: Make cgroup switch visit only cpuctxs with cgroup events")
Signed-off-by: Song Liu <song@kernel.org>
Reviewed-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220204004057.2961252-1-song@kernel.org
---

does that work for you?

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] perf: fix GPF in perf_cgroup_switch()
  2022-02-06 21:33 ` Peter Zijlstra
@ 2022-02-06 22:04   ` Song Liu
  0 siblings, 0 replies; 5+ messages in thread
From: Song Liu @ 2022-02-06 22:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Song Liu, linux-kernel, Kernel Team, Yonghong Song, Rik van Riel


> On Feb 6, 2022, at 1:34 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> 
> On Thu, Feb 03, 2022 at 04:40:57PM -0800, Song Liu wrote:
>> GPF is observed in perf_cgroup_switch():
>> 
>> [ 2683.232477] general protection fault, probably for non-canonical address 0xdeacffffffffff90: 0000 [#1] SMP
>> [ 2683.251802] CPU: 30 PID: 0 Comm: swapper/30 Kdump: loaded Tainted: G S
>> [ 2683.273726] Hardware name: Quanta Twin Lakes MP/Twin Lakes Passive MP,
>> [ 2683.291129] RIP: 0010:perf_cgroup_switch+0xc0/0x170
>> [ 2683.300889] Code: 85 ff ff 48 8b 83 00 01 00 00 48 85 c0 74 04 c6 40 08 \
>>         00 c6 43 08 00 48 8b 83 70 01 00 00 48 8d 98 90 fe ff ff 48 39 c5 \
>>         74 7d <8b> 83 e4 00 00 00 85 c0 0f 84 89 00 00 00 4c 8b bb 00 01 00 00 48
>> [ 2683.338455] RSP: 0018:ffffc9000021fdb0 EFLAGS: 00010002
>> [ 2683.348904] RAX: dead000000000100 RBX: deacffffffffff90 RCX: 000000000000038f
>> [ 2683.363176] RDX: 0000000000000007 RSI: 0000000000000400 RDI: 0000000000000000
>> [ 2683.377447] RBP: ffff88903ffa77b0 R08: 0000000300000003 R09: 0000000000000004
>> [ 2683.391718] R10: 0000000000000003 R11: 0000000000000001 R12: 0000000000000002
>> [ 2683.405989] R13: 0000000000000000 R14: ffff8881013fdc00 R15: 0000000000000000
>> [ 2683.420261] FS:  0000000000000000(0000) GS:ffff88903ff80000(0000) knlGS:0000000000000000
>> [ 2683.436446] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 2683.447937] CR2: 00007ffafb956020 CR3: 0000000141cd7005 CR4: 00000000007706e0
>> [ 2683.462209] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [ 2683.476481] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [ 2683.490752] PKRU: 55555554
>> [ 2683.496160] Call Trace:
>> [ 2683.501048]  __perf_event_task_sched_in+0xb3/0x200
>> [ 2683.510632]  finish_task_switch+0x186/0x270
>> [ 2683.518999]  __schedule+0x3b1/0x850
>> [ 2683.525973]  ? cpuidle_enter_state+0xa7/0x340
>> [ 2683.534687]  ? update_ts_time_stats+0x51/0x70
>> [ 2683.543399]  schedule_idle+0x1e/0x40
>> [ 2683.550548]  do_idle+0x148/0x200
>> [ 2683.557001]  cpu_startup_entry+0x19/0x20
>> [ 2683.564843]  start_secondary+0x104/0x140
>> [ 2683.572688]  secondary_startup_64_no_verify+0xb0/0xbb
> 
> Please, don't put splats in changelogs like this, simply put the
> relevant information in. Which you've already done below, so the above
> is completely redundant.
> 
>> 
>> which indicates list corruption on cgrp_cpuctx_list. This happens on the
>> following path:
>> 
>>  perf_cgroup_switch: list_for_each_entry(cgrp_cpuctx_list)
>>      cpu_ctx_sched_in
>>         ctx_sched_in
>>            ctx_pinned_sched_in
>>              merge_sched_in
>>                  perf_cgroup_event_disable: remove the event from the list
>> 
>> To repro this on Intel CPUs:
>> 
>>  /* occupy all counters with pinned events (watchdog uses another) */
>>  perf stat -e cycles:D,cycles:D,cycles:D,cycles:D,cycles:D -a &
>>  /* add a pinned cgroup event */
>>  perf stat -e cycles:D -G my-cgroup
>>  /* GPF immediately */
> 
> Luckily this requires root, but please don't hand reproducers like this
> to the kiddies :/
> 
>> Fix this with list_for_each_entry_safe().
> 
> *sigh*
> 
> You forgot Fixes, I'm thinking this is due to commit: 058fe1c0440e6.
> 
> With all that, I seem to end up with the below.
> 
> ---
> Subject: perf: Fix list corruption in perf_cgroup_switch()
> From: Song Liu <song@kernel.org>
> Date: Thu, 3 Feb 2022 16:40:57 -0800
> 
> From: Song Liu <song@kernel.org>
> 
> There's list corruption on cgrp_cpuctx_list. This happens on the
> following path:
> 
>  perf_cgroup_switch: list_for_each_entry(cgrp_cpuctx_list)
>      cpu_ctx_sched_in
>         ctx_sched_in
>            ctx_pinned_sched_in
>              merge_sched_in
>                  perf_cgroup_event_disable: remove the event from the list
> 
> Use list_for_each_entry_safe() to allow removing an entry during
> iteration.
> 
> Fixes: 058fe1c0440e ("perf/core: Make cgroup switch visit only cpuctxs with cgroup events")
> Signed-off-by: Song Liu <song@kernel.org>
> Reviewed-by: Rik van Riel <riel@surriel.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Link: https://lkml.kernel.org/r/20220204004057.2961252-1-song@kernel.org
> ---
> 
> does that work for you?

Yes, this works! Thanks for fixing up things. 

Song

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [tip: perf/urgent] perf: Fix list corruption in perf_cgroup_switch()
  2022-02-04  0:40 [PATCH] perf: fix GPF in perf_cgroup_switch() Song Liu
  2022-02-04  0:54 ` Rik van Riel
  2022-02-06 21:33 ` Peter Zijlstra
@ 2022-02-07 13:51 ` tip-bot2 for Song Liu
  2 siblings, 0 replies; 5+ messages in thread
From: tip-bot2 for Song Liu @ 2022-02-07 13:51 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Song Liu, Rik van Riel, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the perf/urgent branch of tip:

Commit-ID:     5f4e5ce638e6a490b976ade4a40017b40abb2da0
Gitweb:        https://git.kernel.org/tip/5f4e5ce638e6a490b976ade4a40017b40abb2da0
Author:        Song Liu <song@kernel.org>
AuthorDate:    Thu, 03 Feb 2022 16:40:57 -08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Sun, 06 Feb 2022 22:37:27 +01:00

perf: Fix list corruption in perf_cgroup_switch()

There's list corruption on cgrp_cpuctx_list. This happens on the
following path:

  perf_cgroup_switch: list_for_each_entry(cgrp_cpuctx_list)
      cpu_ctx_sched_in
         ctx_sched_in
            ctx_pinned_sched_in
              merge_sched_in
                  perf_cgroup_event_disable: remove the event from the list

Use list_for_each_entry_safe() to allow removing an entry during
iteration.

Fixes: 058fe1c0440e ("perf/core: Make cgroup switch visit only cpuctxs with cgroup events")
Signed-off-by: Song Liu <song@kernel.org>
Reviewed-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220204004057.2961252-1-song@kernel.org
---
 kernel/events/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 57c7197..6859229 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -839,7 +839,7 @@ static DEFINE_PER_CPU(struct list_head, cgrp_cpuctx_list);
  */
 static void perf_cgroup_switch(struct task_struct *task, int mode)
 {
-	struct perf_cpu_context *cpuctx;
+	struct perf_cpu_context *cpuctx, *tmp;
 	struct list_head *list;
 	unsigned long flags;
 
@@ -850,7 +850,7 @@ static void perf_cgroup_switch(struct task_struct *task, int mode)
 	local_irq_save(flags);
 
 	list = this_cpu_ptr(&cgrp_cpuctx_list);
-	list_for_each_entry(cpuctx, list, cgrp_cpuctx_entry) {
+	list_for_each_entry_safe(cpuctx, tmp, list, cgrp_cpuctx_entry) {
 		WARN_ON_ONCE(cpuctx->ctx.nr_cgroups == 0);
 
 		perf_ctx_lock(cpuctx, cpuctx->task_ctx);

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-02-07 14:21 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-04  0:40 [PATCH] perf: fix GPF in perf_cgroup_switch() Song Liu
2022-02-04  0:54 ` Rik van Riel
2022-02-06 21:33 ` Peter Zijlstra
2022-02-06 22:04   ` Song Liu
2022-02-07 13:51 ` [tip: perf/urgent] perf: Fix list corruption " tip-bot2 for Song Liu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.