From: Mark Rutland <mark.rutland@arm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, nslusarek@gmx.net,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH] perf: Fix sys_perf_event_open() race against self
Date: Mon, 23 May 2022 11:48:43 +0100
Message-ID: <Yotmi2kM3mm+iPPU@FVFF77S0Q05N>
In-Reply-To: <20220520183806.GV2578@worktop.programming.kicks-ass.net>

Hi Peter,

On Fri, May 20, 2022 at 08:38:06PM +0200, Peter Zijlstra wrote:
> 
> Norbert reported that it's possible to race sys_perf_event_open() such
> that the loser ends up in another context from the group leader,
> triggering many WARNs.

I'm hitting the same with my local arm64 Syzkaller instance, atop v5.18-rc6.

> The move_group case checks for races against itself, but the
> !move_group case doesn't, seemingly relying on the previous
> group_leader->ctx == ctx check. However, that check is racy due to not
> holding any locks at that time.
> 
> Therefore, re-check the result after acquiring locks and bail
> if the contexts no longer match.
> 
> Additionally, disambiguate the not_move_group case from the
> move_group-vs-move_group race.
> 
> Fixes: f63a8daa5812 ("perf: Fix event->ctx locking")
> Reported-by: Norbert Slusarek <nslusarek@gmx.net>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  kernel/events/core.c |   14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -12217,6 +12217,9 @@ SYSCALL_DEFINE5(perf_event_open,
>  		 * Do not allow to attach to a group in a different task
>  		 * or CPU context. If we're moving SW events, we'll fix
>  		 * this up later, so allow that.
> +		 *
> +		 * Racy, not holding group_leader->ctx->mutex, see comment with
> +		 * perf_event_ctx_lock().
>  		 */
>  		if (!move_group && group_leader->ctx != ctx)
>  			goto err_context;

I assume that, since we say we're not holding the mutex, this read is truly
racy and a concurrent write can happen at any time. If that's the case,
shouldn't we be using *_ONCE() to manipulate perf_event::ctx?
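
e.g. something like the below for the read side (an untested sketch, just to
illustrate the shape of it; stores to event->ctx would need a matching
WRITE_ONCE() annotation):

	/*
	 * Untested sketch: annotate the lockless read so the compiler
	 * can't tear or re-load it across the comparison.
	 */
	if (!move_group && READ_ONCE(group_leader->ctx) != ctx)
		goto err_context;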

... or could we remove the racy read entirely and just rely on the later
check with ctx->mutex held? A racing thread can slip past the early check
anyway, so there's no functional need to bail out early, and consistently
using the later test removes some potential for introducing similar races
in future.
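
i.e. (equally untested) drop the early test:

	-	if (!move_group && group_leader->ctx != ctx)
	-		goto err_context;

... leaving the re-check under ctx->mutex from this patch as the sole
gatekeeper:

		mutex_lock(&ctx->mutex);
		if (group_leader && group_leader->ctx != ctx) {
			err = -EINVAL;
			goto err_locked;
		}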

FWIW, with this patch as-is applied atop v5.18-rc6, I no longer see the issue
when testing with the reproducer Syzkaller came up with. Triggering the WARNs
would normally take a few seconds, but the reproducer now survives several
minutes.

For posterity, that reproducer was:

----8<----
r0 = perf_event_open$cgroup(&(0x7f0000000100)={0x1, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, @perf_config_ext}, 0xffffffffffffffff, 0x0, 0xffffffffffffffff, 0x0)
r1 = dup(r0)
perf_event_open(&(0x7f00000001c0)={0x0, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x2, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, @perf_bp={0x0}}, 0xffffffffffffffff, 0x0, r1, 0x0) (async)
perf_event_open$cgroup(&(0x7f0000000100)={0x1, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x10, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, @perf_bp={0x0, 0x5}, 0x0, 0x0, 0x0, 0x0, 0xfffffffffffffffe, 0x0, 0x0, 0x0, 0x0, 0x0, 0x200000000000000}, 0xffffffffffffffff, 0x0, r1, 0x0)
---->8----
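
For readers who don't speak syzlang: the '(async)' annotation means syzkaller
issues that call concurrently with the following one, so the essential
ingredients are two concurrent perf_event_open() calls attaching to the same
software group leader, one hardware event (taking the move_group path) and
one software event (taking the !move_group path). A hand-translated, untested
plain-C sketch of the same shape (the dup() and the exact attr values in the
repro are incidental):

	#include <linux/perf_event.h>
	#include <pthread.h>
	#include <string.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	static int leader_fd;

	static int open_event(__u32 type, __u64 config, int group_fd)
	{
		struct perf_event_attr attr;

		memset(&attr, 0, sizeof(attr));
		attr.type = type;
		attr.size = sizeof(attr);
		attr.config = config;
		/* pid == -1, cpu == 0: per-CPU events, as in the repro. */
		return syscall(__NR_perf_event_open, &attr, -1, 0, group_fd, 0);
	}

	static void *open_hw_event(void *unused)
	{
		/* Hardware event in a software group: move_group path. */
		open_event(PERF_TYPE_HARDWARE, PERF_COUNT_HW_INSTRUCTIONS,
			   leader_fd);
		return NULL;
	}

	int main(void)
	{
		pthread_t thread;

		for (;;) {
			leader_fd = open_event(PERF_TYPE_SOFTWARE,
					       PERF_COUNT_SW_CPU_CLOCK, -1);
			pthread_create(&thread, NULL, open_hw_event, NULL);
			/* Software event, racing the above: !move_group path. */
			open_event(PERF_TYPE_SOFTWARE, PERF_COUNT_SW_CPU_CLOCK,
				   leader_fd);
			pthread_join(thread, NULL);
			close(leader_fd);
		}
	}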

Thanks,
Mark.

> @@ -12282,6 +12285,7 @@ SYSCALL_DEFINE5(perf_event_open,
>  			} else {
>  				perf_event_ctx_unlock(group_leader, gctx);
>  				move_group = 0;
> +				goto not_move_group;
>  			}
>  		}
>  
> @@ -12298,7 +12302,17 @@ SYSCALL_DEFINE5(perf_event_open,
>  		}
>  	} else {
>  		mutex_lock(&ctx->mutex);
> +
> +		/*
> +		 * Now that we hold ctx->lock, (re)validate group_leader->ctx == ctx,
> +		 * see the group_leader && !move_group test earlier.
> +		 */
> +		if (group_leader && group_leader->ctx != ctx) {
> +			err = -EINVAL;
> +			goto err_locked;
> +		}
>  	}
> +not_move_group:
>  
>  	if (ctx->task == TASK_TOMBSTONE) {
>  		err = -ESRCH;
> 
