LKML Archive on lore.kernel.org
* [PATCH] perf: Fix race in perf_mmap_close function
@ 2020-09-10 10:41 Jiri Olsa
  2020-09-10 13:48 ` Namhyung Kim
  0 siblings, 1 reply; 13+ messages in thread
From: Jiri Olsa @ 2020-09-10 10:41 UTC (permalink / raw)
  To: Peter Zijlstra, Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, Alexander Shishkin,
	Michael Petlan, Wade Mealing

There's a possible race in perf_mmap_close when checking the ring buffer's
mmap_count refcount value. The problem is that the mmap_count check is
not atomic, because we call atomic_dec and atomic_read separately.

  perf_mmap_close:
  ...
   atomic_dec(&rb->mmap_count);
   ...
   if (atomic_read(&rb->mmap_count))
      goto out_put;

   <ring buffer detach>
   free_uid

out_put:
  ring_buffer_put(rb); /* could be last */

The race can happen when we have two (or more) events sharing the same ring
buffer: both go through atomic_dec and then both see 0 as the refcount
value later in atomic_read. Then both go on and execute code which
is meant to run just once.

The code that detaches the ring buffer is probably fine to execute more
than once, but the problem is the call to free_uid, which will later
manifest in related crashes and refcount warnings, like:

  refcount_t: addition on 0; use-after-free.
  ...
  RIP: 0010:refcount_warn_saturate+0x6d/0xf
  ...
  Call Trace:
  prepare_creds+0x190/0x1e0
  copy_creds+0x35/0x172
  copy_process+0x471/0x1a80
  _do_fork+0x83/0x3a0
  __do_sys_wait4+0x83/0x90
  __do_sys_clone+0x85/0xa0
  do_syscall_64+0x5b/0x1e0
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Use an atomic decrement-and-test instead of the separate calls.
This fixes CVE-2020-14351.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 kernel/events/core.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 7ed5248f0445..29313cc54d9e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5903,8 +5903,6 @@ static void perf_mmap_close(struct vm_area_struct *vma)
 		mutex_unlock(&event->mmap_mutex);
 	}
 
-	atomic_dec(&rb->mmap_count);
-
 	if (!atomic_dec_and_mutex_lock(&event->mmap_count, &event->mmap_mutex))
 		goto out_put;
 
@@ -5912,7 +5910,7 @@ static void perf_mmap_close(struct vm_area_struct *vma)
 	mutex_unlock(&event->mmap_mutex);
 
 	/* If there's still other mmap()s of this buffer, we're done. */
-	if (atomic_read(&rb->mmap_count))
+	if (!atomic_dec_and_test(&rb->mmap_count))
 		goto out_put;
 
 	/*
-- 
2.26.2



* Re: [PATCH] perf: Fix race in perf_mmap_close function
  2020-09-10 10:41 [PATCH] perf: Fix race in perf_mmap_close function Jiri Olsa
@ 2020-09-10 13:48 ` Namhyung Kim
  2020-09-10 14:47   ` Jiri Olsa
  0 siblings, 1 reply; 13+ messages in thread
From: Namhyung Kim @ 2020-09-10 13:48 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Peter Zijlstra, Arnaldo Carvalho de Melo, lkml, Ingo Molnar,
	Alexander Shishkin, Michael Petlan, Wade Mealing

On Thu, Sep 10, 2020 at 7:42 PM Jiri Olsa <jolsa@kernel.org> wrote:
>
> There's a possible race in perf_mmap_close when checking ring buffer's
> mmap_count refcount value. The problem is that the mmap_count check is
> not atomic because we call atomic_dec and atomic_read separately.
>
>   perf_mmap_close:
>   ...
>    atomic_dec(&rb->mmap_count);
>    ...
>    if (atomic_read(&rb->mmap_count))
>       goto out_put;
>
>    <ring buffer detach>
>    free_uid
>
> out_put:
>   ring_buffer_put(rb); /* could be last */
>
> The race can happen when we have two (or more) events sharing same ring
> buffer and they go through atomic_dec and then they both see 0 as refcount
> value later in atomic_read. Then both will go on and execute code which
> is meant to be run just once.
>
> The code that detaches ring buffer is probably fine to be executed more
> than once, but the problem is in calling free_uid, which will later on
> demonstrate in related crashes and refcount warnings, like:
>
>   refcount_t: addition on 0; use-after-free.
>   ...
>   RIP: 0010:refcount_warn_saturate+0x6d/0xf
>   ...
>   Call Trace:
>   prepare_creds+0x190/0x1e0
>   copy_creds+0x35/0x172
>   copy_process+0x471/0x1a80
>   _do_fork+0x83/0x3a0
>   __do_sys_wait4+0x83/0x90
>   __do_sys_clone+0x85/0xa0
>   do_syscall_64+0x5b/0x1e0
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> Using atomic decrease and check instead of separated calls.
> This fixes CVE-2020-14351.
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  kernel/events/core.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 7ed5248f0445..29313cc54d9e 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -5903,8 +5903,6 @@ static void perf_mmap_close(struct vm_area_struct *vma)
>                 mutex_unlock(&event->mmap_mutex);
>         }
>
> -       atomic_dec(&rb->mmap_count);
> -
>         if (!atomic_dec_and_mutex_lock(&event->mmap_count, &event->mmap_mutex))
>                 goto out_put;

But when it takes the goto, rb->mmap_count won't decrement anymore..

Thanks
Namhyung

>
> @@ -5912,7 +5910,7 @@ static void perf_mmap_close(struct vm_area_struct *vma)
>         mutex_unlock(&event->mmap_mutex);
>
>         /* If there's still other mmap()s of this buffer, we're done. */
> -       if (atomic_read(&rb->mmap_count))
> +       if (!atomic_dec_and_test(&rb->mmap_count))
>                 goto out_put;
>
>         /*
> --
> 2.26.2
>


* Re: [PATCH] perf: Fix race in perf_mmap_close function
  2020-09-10 13:48 ` Namhyung Kim
@ 2020-09-10 14:47   ` Jiri Olsa
  2020-09-11  3:05     ` Namhyung Kim
  0 siblings, 1 reply; 13+ messages in thread
From: Jiri Olsa @ 2020-09-10 14:47 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jiri Olsa, Peter Zijlstra, Arnaldo Carvalho de Melo, lkml,
	Ingo Molnar, Alexander Shishkin, Michael Petlan, Wade Mealing

On Thu, Sep 10, 2020 at 10:48:02PM +0900, Namhyung Kim wrote:

SNIP

> >   _do_fork+0x83/0x3a0
> >   __do_sys_wait4+0x83/0x90
> >   __do_sys_clone+0x85/0xa0
> >   do_syscall_64+0x5b/0x1e0
> >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >
> > Using atomic decrease and check instead of separated calls.
> > This fixes CVE-2020-14351.
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  kernel/events/core.c | 4 +---
> >  1 file changed, 1 insertion(+), 3 deletions(-)
> >
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index 7ed5248f0445..29313cc54d9e 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -5903,8 +5903,6 @@ static void perf_mmap_close(struct vm_area_struct *vma)
> >                 mutex_unlock(&event->mmap_mutex);
> >         }
> >
> > -       atomic_dec(&rb->mmap_count);
> > -
> >         if (!atomic_dec_and_mutex_lock(&event->mmap_count, &event->mmap_mutex))
> >                 goto out_put;
> 
> But when it takes the goto, rb->mmap_count won't decrement anymore..

event->mmap_count is per event, so if we have a race in here,
2 threads can go through with each event's mmap_count reaching zero

jirka



* Re: [PATCH] perf: Fix race in perf_mmap_close function
  2020-09-10 14:47   ` Jiri Olsa
@ 2020-09-11  3:05     ` Namhyung Kim
  2020-09-11  7:49       ` Jiri Olsa
  0 siblings, 1 reply; 13+ messages in thread
From: Namhyung Kim @ 2020-09-11  3:05 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, Peter Zijlstra, Arnaldo Carvalho de Melo, lkml,
	Ingo Molnar, Alexander Shishkin, Michael Petlan, Wade Mealing

Hi Jiri,

On Thu, Sep 10, 2020 at 11:50 PM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Thu, Sep 10, 2020 at 10:48:02PM +0900, Namhyung Kim wrote:
>
> SNIP
>
> > >   _do_fork+0x83/0x3a0
> > >   __do_sys_wait4+0x83/0x90
> > >   __do_sys_clone+0x85/0xa0
> > >   do_syscall_64+0x5b/0x1e0
> > >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > >
> > > Using atomic decrease and check instead of separated calls.
> > > This fixes CVE-2020-14351.
> > >
> > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > ---
> > >  kernel/events/core.c | 4 +---
> > >  1 file changed, 1 insertion(+), 3 deletions(-)
> > >
> > > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > > index 7ed5248f0445..29313cc54d9e 100644
> > > --- a/kernel/events/core.c
> > > +++ b/kernel/events/core.c
> > > @@ -5903,8 +5903,6 @@ static void perf_mmap_close(struct vm_area_struct *vma)
> > >                 mutex_unlock(&event->mmap_mutex);
> > >         }
> > >
> > > -       atomic_dec(&rb->mmap_count);
> > > -
> > >         if (!atomic_dec_and_mutex_lock(&event->mmap_count, &event->mmap_mutex))
> > >                 goto out_put;
> >
> > But when it takes the goto, rb->mmap_count won't decrement anymore..
>
> event->mmap_count is per event, so if we have have race in here,
> 2 threads can go through with each event->mmap_count reaching zero

Maybe I'm missing something.

But as far as I can see, perf_mmap_close() always decremented both
rb->mmap_count and event->mmap_count.  With this change, it seems
it no longer decrements rb->mmap_count when event->mmap_count
doesn't go to zero, right?

Thanks
Namhyung


* Re: [PATCH] perf: Fix race in perf_mmap_close function
  2020-09-11  3:05     ` Namhyung Kim
@ 2020-09-11  7:49       ` Jiri Olsa
  2020-09-14 12:48         ` Namhyung Kim
  0 siblings, 1 reply; 13+ messages in thread
From: Jiri Olsa @ 2020-09-11  7:49 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jiri Olsa, Peter Zijlstra, Arnaldo Carvalho de Melo, lkml,
	Ingo Molnar, Alexander Shishkin, Michael Petlan, Wade Mealing

On Fri, Sep 11, 2020 at 12:05:10PM +0900, Namhyung Kim wrote:
> Hi Jiri,
> 
> On Thu, Sep 10, 2020 at 11:50 PM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Thu, Sep 10, 2020 at 10:48:02PM +0900, Namhyung Kim wrote:
> >
> > SNIP
> >
> > > >   _do_fork+0x83/0x3a0
> > > >   __do_sys_wait4+0x83/0x90
> > > >   __do_sys_clone+0x85/0xa0
> > > >   do_syscall_64+0x5b/0x1e0
> > > >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > >
> > > > Using atomic decrease and check instead of separated calls.
> > > > This fixes CVE-2020-14351.
> > > >
> > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > ---
> > > >  kernel/events/core.c | 4 +---
> > > >  1 file changed, 1 insertion(+), 3 deletions(-)
> > > >
> > > > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > > > index 7ed5248f0445..29313cc54d9e 100644
> > > > --- a/kernel/events/core.c
> > > > +++ b/kernel/events/core.c
> > > > @@ -5903,8 +5903,6 @@ static void perf_mmap_close(struct vm_area_struct *vma)
> > > >                 mutex_unlock(&event->mmap_mutex);
> > > >         }
> > > >
> > > > -       atomic_dec(&rb->mmap_count);
> > > > -
> > > >         if (!atomic_dec_and_mutex_lock(&event->mmap_count, &event->mmap_mutex))
> > > >                 goto out_put;
> > >
> > > But when it takes the goto, rb->mmap_count won't decrement anymore..
> >
> > event->mmap_count is per event, so if we have have race in here,
> > 2 threads can go through with each event->mmap_count reaching zero
> 
> Maybe I'm missing something.
> 
> But as far as I can see, perf_mmap_close() always decremented both
> rb->mmap_count and event->mmap_count.  But with this change,
> it seems not decrement rb->mmap_count when event->mmap_count
> doesn't go to zero, right?

ugh, that's right.. how about change below

jirka


---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 7ed5248f0445..8ab2400aef55 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5868,11 +5868,11 @@ static void perf_pmu_output_stop(struct perf_event *event);
 static void perf_mmap_close(struct vm_area_struct *vma)
 {
 	struct perf_event *event = vma->vm_file->private_data;
-
 	struct perf_buffer *rb = ring_buffer_get(event);
 	struct user_struct *mmap_user = rb->mmap_user;
 	int mmap_locked = rb->mmap_locked;
 	unsigned long size = perf_data_size(rb);
+	bool detach_rest = false;
 
 	if (event->pmu->event_unmapped)
 		event->pmu->event_unmapped(event, vma->vm_mm);
@@ -5903,7 +5903,8 @@ static void perf_mmap_close(struct vm_area_struct *vma)
 		mutex_unlock(&event->mmap_mutex);
 	}
 
-	atomic_dec(&rb->mmap_count);
+	if (atomic_dec_and_test(&rb->mmap_count))
+		detach_rest = true;
 
 	if (!atomic_dec_and_mutex_lock(&event->mmap_count, &event->mmap_mutex))
 		goto out_put;
@@ -5912,7 +5913,7 @@ static void perf_mmap_close(struct vm_area_struct *vma)
 	mutex_unlock(&event->mmap_mutex);
 
 	/* If there's still other mmap()s of this buffer, we're done. */
-	if (atomic_read(&rb->mmap_count))
+	if (!detach_rest)
 		goto out_put;
 
 	/*



* Re: [PATCH] perf: Fix race in perf_mmap_close function
  2020-09-11  7:49       ` Jiri Olsa
@ 2020-09-14 12:48         ` Namhyung Kim
  2020-09-14 20:59           ` Jiri Olsa
  0 siblings, 1 reply; 13+ messages in thread
From: Namhyung Kim @ 2020-09-14 12:48 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, Peter Zijlstra, Arnaldo Carvalho de Melo, lkml,
	Ingo Molnar, Alexander Shishkin, Michael Petlan, Wade Mealing

On Fri, Sep 11, 2020 at 4:49 PM Jiri Olsa <jolsa@redhat.com> wrote:
> ugh, that's right.. how about change below

Acked-by: Namhyung Kim <namhyung@kernel.org>

Thanks
Namhyung


>
> jirka
>
>
> ---
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 7ed5248f0445..8ab2400aef55 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -5868,11 +5868,11 @@ static void perf_pmu_output_stop(struct perf_event *event);
>  static void perf_mmap_close(struct vm_area_struct *vma)
>  {
>         struct perf_event *event = vma->vm_file->private_data;
> -
>         struct perf_buffer *rb = ring_buffer_get(event);
>         struct user_struct *mmap_user = rb->mmap_user;
>         int mmap_locked = rb->mmap_locked;
>         unsigned long size = perf_data_size(rb);
> +       bool detach_rest = false;
>
>         if (event->pmu->event_unmapped)
>                 event->pmu->event_unmapped(event, vma->vm_mm);
> @@ -5903,7 +5903,8 @@ static void perf_mmap_close(struct vm_area_struct *vma)
>                 mutex_unlock(&event->mmap_mutex);
>         }
>
> -       atomic_dec(&rb->mmap_count);
> +       if (atomic_dec_and_test(&rb->mmap_count))
> +               detach_rest = true;
>
>         if (!atomic_dec_and_mutex_lock(&event->mmap_count, &event->mmap_mutex))
>                 goto out_put;
> @@ -5912,7 +5913,7 @@ static void perf_mmap_close(struct vm_area_struct *vma)
>         mutex_unlock(&event->mmap_mutex);
>
>         /* If there's still other mmap()s of this buffer, we're done. */
> -       if (atomic_read(&rb->mmap_count))
> +       if (!detach_rest)
>                 goto out_put;
>
>         /*
>


* Re: [PATCH] perf: Fix race in perf_mmap_close function
  2020-09-14 12:48         ` Namhyung Kim
@ 2020-09-14 20:59           ` Jiri Olsa
  2020-09-15 15:35             ` Michael Petlan
  0 siblings, 1 reply; 13+ messages in thread
From: Jiri Olsa @ 2020-09-14 20:59 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jiri Olsa, Peter Zijlstra, Arnaldo Carvalho de Melo, lkml,
	Ingo Molnar, Alexander Shishkin, Michael Petlan, Wade Mealing

On Mon, Sep 14, 2020 at 09:48:43PM +0900, Namhyung Kim wrote:
> On Fri, Sep 11, 2020 at 4:49 PM Jiri Olsa <jolsa@redhat.com> wrote:
> > ugh, that's right.. how about change below
> 
> Acked-by: Namhyung Kim <namhyung@kernel.org>

thanks, I'll send full patch after we're done testing this

jirka

> 
> Thanks
> Namhyung
> 
> 
> >
> > jirka
> >
> >
> > ---
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index 7ed5248f0445..8ab2400aef55 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -5868,11 +5868,11 @@ static void perf_pmu_output_stop(struct perf_event *event);
> >  static void perf_mmap_close(struct vm_area_struct *vma)
> >  {
> >         struct perf_event *event = vma->vm_file->private_data;
> > -
> >         struct perf_buffer *rb = ring_buffer_get(event);
> >         struct user_struct *mmap_user = rb->mmap_user;
> >         int mmap_locked = rb->mmap_locked;
> >         unsigned long size = perf_data_size(rb);
> > +       bool detach_rest = false;
> >
> >         if (event->pmu->event_unmapped)
> >                 event->pmu->event_unmapped(event, vma->vm_mm);
> > @@ -5903,7 +5903,8 @@ static void perf_mmap_close(struct vm_area_struct *vma)
> >                 mutex_unlock(&event->mmap_mutex);
> >         }
> >
> > -       atomic_dec(&rb->mmap_count);
> > +       if (atomic_dec_and_test(&rb->mmap_count))
> > +               detach_rest = true;
> >
> >         if (!atomic_dec_and_mutex_lock(&event->mmap_count, &event->mmap_mutex))
> >                 goto out_put;
> > @@ -5912,7 +5913,7 @@ static void perf_mmap_close(struct vm_area_struct *vma)
> >         mutex_unlock(&event->mmap_mutex);
> >
> >         /* If there's still other mmap()s of this buffer, we're done. */
> > -       if (atomic_read(&rb->mmap_count))
> > +       if (!detach_rest)
> >                 goto out_put;
> >
> >         /*
> >
> 



* Re: [PATCH] perf: Fix race in perf_mmap_close function
  2020-09-14 20:59           ` Jiri Olsa
@ 2020-09-15 15:35             ` Michael Petlan
  2020-09-16 11:53               ` [PATCHv2] " Jiri Olsa
  0 siblings, 1 reply; 13+ messages in thread
From: Michael Petlan @ 2020-09-15 15:35 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Namhyung Kim, Jiri Olsa, Peter Zijlstra,
	Arnaldo Carvalho de Melo, lkml, Ingo Molnar, Alexander Shishkin,
	Wade Mealing

On Mon, 14 Sep 2020, Jiri Olsa wrote:
> On Mon, Sep 14, 2020 at 09:48:43PM +0900, Namhyung Kim wrote:
> > On Fri, Sep 11, 2020 at 4:49 PM Jiri Olsa <jolsa@redhat.com> wrote:
> > > ugh, that's right.. how about change below
> > 
> > Acked-by: Namhyung Kim <namhyung@kernel.org>
> 
> thanks, I'll send full patch after we're done testing this
> 
> jirka

I've tested the change and it seems OK to me.

Tested-by: Michael Petlan <mpetlan@redhat.com>

> 
> > 
> > Thanks
> > Namhyung
> > 
> > 
> > >
> > > jirka
> > >
> > >
> > > ---
> > > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > > index 7ed5248f0445..8ab2400aef55 100644
> > > --- a/kernel/events/core.c
> > > +++ b/kernel/events/core.c
> > > @@ -5868,11 +5868,11 @@ static void perf_pmu_output_stop(struct perf_event *event);
> > >  static void perf_mmap_close(struct vm_area_struct *vma)
> > >  {
> > >         struct perf_event *event = vma->vm_file->private_data;
> > > -
> > >         struct perf_buffer *rb = ring_buffer_get(event);
> > >         struct user_struct *mmap_user = rb->mmap_user;
> > >         int mmap_locked = rb->mmap_locked;
> > >         unsigned long size = perf_data_size(rb);
> > > +       bool detach_rest = false;
> > >
> > >         if (event->pmu->event_unmapped)
> > >                 event->pmu->event_unmapped(event, vma->vm_mm);
> > > @@ -5903,7 +5903,8 @@ static void perf_mmap_close(struct vm_area_struct *vma)
> > >                 mutex_unlock(&event->mmap_mutex);
> > >         }
> > >
> > > -       atomic_dec(&rb->mmap_count);
> > > +       if (atomic_dec_and_test(&rb->mmap_count))
> > > +               detach_rest = true;
> > >
> > >         if (!atomic_dec_and_mutex_lock(&event->mmap_count, &event->mmap_mutex))
> > >                 goto out_put;
> > > @@ -5912,7 +5913,7 @@ static void perf_mmap_close(struct vm_area_struct *vma)
> > >         mutex_unlock(&event->mmap_mutex);
> > >
> > >         /* If there's still other mmap()s of this buffer, we're done. */
> > > -       if (atomic_read(&rb->mmap_count))
> > > +       if (!detach_rest)
> > >                 goto out_put;
> > >
> > >         /*
> > >
> > 
> 



* [PATCHv2] perf: Fix race in perf_mmap_close function
  2020-09-15 15:35             ` Michael Petlan
@ 2020-09-16 11:53               ` Jiri Olsa
  2020-09-16 13:54                 ` peterz
                                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Jiri Olsa @ 2020-09-16 11:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Namhyung Kim, Jiri Olsa, Arnaldo Carvalho de Melo, lkml,
	Ingo Molnar, Alexander Shishkin, Wade Mealing

There's a possible race in perf_mmap_close when checking the ring buffer's
mmap_count refcount value. The problem is that the mmap_count check is
not atomic, because we call atomic_dec and atomic_read separately.

  perf_mmap_close:
  ...
   atomic_dec(&rb->mmap_count);
   ...
   if (atomic_read(&rb->mmap_count))
      goto out_put;

   <ring buffer detach>
   free_uid

out_put:
  ring_buffer_put(rb); /* could be last */

The race can happen when we have two (or more) events sharing the same ring
buffer: both go through atomic_dec and then both see 0 as the refcount
value later in atomic_read. Then both go on and execute code which
is meant to run just once.

The code that detaches the ring buffer is probably fine to execute more
than once, but the problem is the call to free_uid, which will later
manifest in related crashes and refcount warnings, like:

  refcount_t: addition on 0; use-after-free.
  ...
  RIP: 0010:refcount_warn_saturate+0x6d/0xf
  ...
  Call Trace:
  prepare_creds+0x190/0x1e0
  copy_creds+0x35/0x172
  copy_process+0x471/0x1a80
  _do_fork+0x83/0x3a0
  __do_sys_wait4+0x83/0x90
  __do_sys_clone+0x85/0xa0
  do_syscall_64+0x5b/0x1e0
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Use an atomic decrement-and-test instead of the separate calls.
This fixes CVE-2020-14351.

Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 kernel/events/core.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 7ed5248f0445..8ab2400aef55 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5868,11 +5868,11 @@ static void perf_pmu_output_stop(struct perf_event *event);
 static void perf_mmap_close(struct vm_area_struct *vma)
 {
 	struct perf_event *event = vma->vm_file->private_data;
-
 	struct perf_buffer *rb = ring_buffer_get(event);
 	struct user_struct *mmap_user = rb->mmap_user;
 	int mmap_locked = rb->mmap_locked;
 	unsigned long size = perf_data_size(rb);
+	bool detach_rest = false;
 
 	if (event->pmu->event_unmapped)
 		event->pmu->event_unmapped(event, vma->vm_mm);
@@ -5903,7 +5903,8 @@ static void perf_mmap_close(struct vm_area_struct *vma)
 		mutex_unlock(&event->mmap_mutex);
 	}
 
-	atomic_dec(&rb->mmap_count);
+	if (atomic_dec_and_test(&rb->mmap_count))
+		detach_rest = true;
 
 	if (!atomic_dec_and_mutex_lock(&event->mmap_count, &event->mmap_mutex))
 		goto out_put;
@@ -5912,7 +5913,7 @@ static void perf_mmap_close(struct vm_area_struct *vma)
 	mutex_unlock(&event->mmap_mutex);
 
 	/* If there's still other mmap()s of this buffer, we're done. */
-	if (atomic_read(&rb->mmap_count))
+	if (!detach_rest)
 		goto out_put;
 
 	/*
-- 
2.26.2



* Re: [PATCHv2] perf: Fix race in perf_mmap_close function
  2020-09-16 11:53               ` [PATCHv2] " Jiri Olsa
@ 2020-09-16 13:54                 ` peterz
  2020-09-16 14:38                   ` Jiri Olsa
  2020-09-16 14:05                 ` peterz
  2020-10-12 11:45                 ` [tip: perf/core] perf/core: Fix race in the perf_mmap_close() function tip-bot2 for Jiri Olsa
  2 siblings, 1 reply; 13+ messages in thread
From: peterz @ 2020-09-16 13:54 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Namhyung Kim, Jiri Olsa, Arnaldo Carvalho de Melo, lkml,
	Ingo Molnar, Alexander Shishkin, Wade Mealing

On Wed, Sep 16, 2020 at 01:53:11PM +0200, Jiri Olsa wrote:
> There's a possible race in perf_mmap_close when checking ring buffer's
> mmap_count refcount value. The problem is that the mmap_count check is
> not atomic because we call atomic_dec and atomic_read separately.
> 
>   perf_mmap_close:
>   ...
>    atomic_dec(&rb->mmap_count);
>    ...
>    if (atomic_read(&rb->mmap_count))
>       goto out_put;
> 
>    <ring buffer detach>
>    free_uid
> 
> out_put:
>   ring_buffer_put(rb); /* could be last */
> 
> The race can happen when we have two (or more) events sharing same ring
> buffer and they go through atomic_dec and then they both see 0 as refcount
> value later in atomic_read. Then both will go on and execute code which
> is meant to be run just once.
> 
> The code that detaches ring buffer is probably fine to be executed more
> than once, but the problem is in calling free_uid, which will later on
> demonstrate in related crashes and refcount warnings, like:
> 
>   refcount_t: addition on 0; use-after-free.
>   ...
>   RIP: 0010:refcount_warn_saturate+0x6d/0xf
>   ...
>   Call Trace:
>   prepare_creds+0x190/0x1e0
>   copy_creds+0x35/0x172
>   copy_process+0x471/0x1a80
>   _do_fork+0x83/0x3a0
>   __do_sys_wait4+0x83/0x90
>   __do_sys_clone+0x85/0xa0
>   do_syscall_64+0x5b/0x1e0
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Using atomic decrease and check instead of separated calls.
> This fixes CVE-2020-14351.

I'm tempted to remove that line; I can never seem to find anything
useful in a CVE.

Fixes: ?

> Acked-by: Namhyung Kim <namhyung@kernel.org>
> Tested-by: Michael Petlan <mpetlan@redhat.com>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>


* Re: [PATCHv2] perf: Fix race in perf_mmap_close function
  2020-09-16 11:53               ` [PATCHv2] " Jiri Olsa
  2020-09-16 13:54                 ` peterz
@ 2020-09-16 14:05                 ` peterz
  2020-10-12 11:45                 ` [tip: perf/core] perf/core: Fix race in the perf_mmap_close() function tip-bot2 for Jiri Olsa
  2 siblings, 0 replies; 13+ messages in thread
From: peterz @ 2020-09-16 14:05 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Namhyung Kim, Jiri Olsa, Arnaldo Carvalho de Melo, lkml,
	Ingo Molnar, Alexander Shishkin, Wade Mealing

On Wed, Sep 16, 2020 at 01:53:11PM +0200, Jiri Olsa wrote:
> There's a possible race in perf_mmap_close when checking ring buffer's
> mmap_count refcount value. The problem is that the mmap_count check is
> not atomic because we call atomic_dec and atomic_read separately.
> 
>   perf_mmap_close:
>   ...
>    atomic_dec(&rb->mmap_count);
>    ...
>    if (atomic_read(&rb->mmap_count))
>       goto out_put;
> 
>    <ring buffer detach>
>    free_uid
> 
> out_put:
>   ring_buffer_put(rb); /* could be last */
> 
> The race can happen when we have two (or more) events sharing same ring
> buffer and they go through atomic_dec and then they both see 0 as refcount
> value later in atomic_read. Then both will go on and execute code which
> is meant to be run just once.

The trivial case should be protected by mmap_sem; we call vm_ops->close()
with mmap_sem held for writing, IIRC. But yes, I think it's possible to
construct a failure here.


* Re: [PATCHv2] perf: Fix race in perf_mmap_close function
  2020-09-16 13:54                 ` peterz
@ 2020-09-16 14:38                   ` Jiri Olsa
  0 siblings, 0 replies; 13+ messages in thread
From: Jiri Olsa @ 2020-09-16 14:38 UTC (permalink / raw)
  To: peterz
  Cc: Namhyung Kim, Jiri Olsa, Arnaldo Carvalho de Melo, lkml,
	Ingo Molnar, Alexander Shishkin, Wade Mealing

On Wed, Sep 16, 2020 at 03:54:02PM +0200, peterz@infradead.org wrote:
> On Wed, Sep 16, 2020 at 01:53:11PM +0200, Jiri Olsa wrote:
> > There's a possible race in perf_mmap_close when checking ring buffer's
> > mmap_count refcount value. The problem is that the mmap_count check is
> > not atomic because we call atomic_dec and atomic_read separately.
> > 
> >   perf_mmap_close:
> >   ...
> >    atomic_dec(&rb->mmap_count);
> >    ...
> >    if (atomic_read(&rb->mmap_count))
> >       goto out_put;
> > 
> >    <ring buffer detach>
> >    free_uid
> > 
> > out_put:
> >   ring_buffer_put(rb); /* could be last */
> > 
> > The race can happen when we have two (or more) events sharing same ring
> > buffer and they go through atomic_dec and then they both see 0 as refcount
> > value later in atomic_read. Then both will go on and execute code which
> > is meant to be run just once.
> > 
> > The code that detaches ring buffer is probably fine to be executed more
> > than once, but the problem is in calling free_uid, which will later on
> > demonstrate in related crashes and refcount warnings, like:
> > 
> >   refcount_t: addition on 0; use-after-free.
> >   ...
> >   RIP: 0010:refcount_warn_saturate+0x6d/0xf
> >   ...
> >   Call Trace:
> >   prepare_creds+0x190/0x1e0
> >   copy_creds+0x35/0x172
> >   copy_process+0x471/0x1a80
> >   _do_fork+0x83/0x3a0
> >   __do_sys_wait4+0x83/0x90
> >   __do_sys_clone+0x85/0xa0
> >   do_syscall_64+0x5b/0x1e0
> >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > 
> > Using atomic decrease and check instead of separated calls.
> > This fixes CVE-2020-14351.
> 
> I'm tempted to remove that line; I can never seem to find anything
> useful in a CVE.

I was asked by security guys to add this, Wade?

> 
> Fixes: ?

right, sry..

Fixes: 9bb5d40cd93c ("perf: Fix mmap() accounting hole");

thanks,
jirka

> 
> > Acked-by: Namhyung Kim <namhyung@kernel.org>
> > Tested-by: Michael Petlan <mpetlan@redhat.com>
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> 



* [tip: perf/core] perf/core: Fix race in the perf_mmap_close() function
  2020-09-16 11:53               ` [PATCHv2] " Jiri Olsa
  2020-09-16 13:54                 ` peterz
  2020-09-16 14:05                 ` peterz
@ 2020-10-12 11:45                 ` tip-bot2 for Jiri Olsa
  2 siblings, 0 replies; 13+ messages in thread
From: tip-bot2 for Jiri Olsa @ 2020-10-12 11:45 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Michael Petlan, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
	Namhyung Kim, Wade Mealing, x86, LKML

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     f91072ed1b7283b13ca57fcfbece5a3b92726143
Gitweb:        https://git.kernel.org/tip/f91072ed1b7283b13ca57fcfbece5a3b92726143
Author:        Jiri Olsa <jolsa@redhat.com>
AuthorDate:    Wed, 16 Sep 2020 13:53:11 +02:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Mon, 12 Oct 2020 13:24:26 +02:00

perf/core: Fix race in the perf_mmap_close() function

There's a possible race in perf_mmap_close() when checking the ring buffer's
mmap_count refcount value. The problem is that the mmap_count check is
not atomic, because we call atomic_dec() and atomic_read() separately.

  perf_mmap_close:
  ...
   atomic_dec(&rb->mmap_count);
   ...
   if (atomic_read(&rb->mmap_count))
      goto out_put;

   <ring buffer detach>
   free_uid

out_put:
  ring_buffer_put(rb); /* could be last */

The race can happen when we have two (or more) events sharing the same ring
buffer: both go through atomic_dec() and then both see 0 as the refcount
value later in atomic_read(). Then both go on and execute code which
is meant to run just once.

The code that detaches the ring buffer is probably fine to execute more
than once, but the problem is the call to free_uid(), which will later
manifest in related crashes and refcount warnings, like:

  refcount_t: addition on 0; use-after-free.
  ...
  RIP: 0010:refcount_warn_saturate+0x6d/0xf
  ...
  Call Trace:
  prepare_creds+0x190/0x1e0
  copy_creds+0x35/0x172
  copy_process+0x471/0x1a80
  _do_fork+0x83/0x3a0
  __do_sys_wait4+0x83/0x90
  __do_sys_clone+0x85/0xa0
  do_syscall_64+0x5b/0x1e0
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Use an atomic decrement-and-test instead of the separate calls.

Tested-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Wade Mealing <wmealing@redhat.com>
Fixes: 9bb5d40cd93c ("perf: Fix mmap() accounting hole")
Link: https://lore.kernel.org/r/20200916115311.GE2301783@krava
---
 kernel/events/core.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 45edb85..fb662eb 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5895,11 +5895,11 @@ static void perf_pmu_output_stop(struct perf_event *event);
 static void perf_mmap_close(struct vm_area_struct *vma)
 {
 	struct perf_event *event = vma->vm_file->private_data;
-
 	struct perf_buffer *rb = ring_buffer_get(event);
 	struct user_struct *mmap_user = rb->mmap_user;
 	int mmap_locked = rb->mmap_locked;
 	unsigned long size = perf_data_size(rb);
+	bool detach_rest = false;
 
 	if (event->pmu->event_unmapped)
 		event->pmu->event_unmapped(event, vma->vm_mm);
@@ -5930,7 +5930,8 @@ static void perf_mmap_close(struct vm_area_struct *vma)
 		mutex_unlock(&event->mmap_mutex);
 	}
 
-	atomic_dec(&rb->mmap_count);
+	if (atomic_dec_and_test(&rb->mmap_count))
+		detach_rest = true;
 
 	if (!atomic_dec_and_mutex_lock(&event->mmap_count, &event->mmap_mutex))
 		goto out_put;
@@ -5939,7 +5940,7 @@ static void perf_mmap_close(struct vm_area_struct *vma)
 	mutex_unlock(&event->mmap_mutex);
 
 	/* If there's still other mmap()s of this buffer, we're done. */
-	if (atomic_read(&rb->mmap_count))
+	if (!detach_rest)
 		goto out_put;
 
 	/*
