linux-kernel.vger.kernel.org archive mirror
* Coccinelle rule for CVE-2019-18683
@ 2020-04-08 22:01 Alexander Popov
  2020-04-08 22:26 ` Jann Horn
  2020-04-09 10:53 ` [Cocci] " Julia Lawall
  0 siblings, 2 replies; 8+ messages in thread
From: Alexander Popov @ 2020-04-08 22:01 UTC (permalink / raw)
  To: Julia Lawall, Gilles Muller, Nicolas Palix, Michal Marek, cocci,
	kernel-hardening, Jann Horn, Kees Cook, Hans Verkuil,
	Mauro Carvalho Chehab, Linux Media Mailing List, LKML

Hello!

Some time ago I fixed CVE-2019-18683 in the V4L2 subsystem of the Linux kernel.

I created a Coccinelle rule that detects that bug pattern. Let me show it.


Bug pattern
===========

CVE-2019-18683 refers to three similar vulnerabilities caused by the same
incorrect approach to locking that is used in vivid_stop_generating_vid_cap(),
vivid_stop_generating_vid_out(), and sdr_cap_stop_streaming().

For the fixes, please see commit 6dcd5d7a7a29c1e4 ("media: vivid: Fix wrong
locking that causes race conditions on streaming stop").

These three functions are called during streaming stop with vivid_dev.mutex
locked. And they all make the same mistake while stopping their kthreads, which
need to take this mutex as well. See the example from
vivid_stop_generating_vid_cap():
    /* shutdown control thread */
    vivid_grab_controls(dev, false);
    mutex_unlock(&dev->mutex);
    kthread_stop(dev->kthread_vid_cap);
    dev->kthread_vid_cap = NULL;
    mutex_lock(&dev->mutex);

But while this mutex is unlocked, another vb2_fop_read() can take it instead of
the kthread and manipulate the buffer queue. That causes a use-after-free.

I created a Coccinelle rule that detects mutex_unlock+kthread_stop+mutex_lock
within one function.


Coccinelle rule
===============

virtual report

@race exists@
expression E;
position stop_p;
position unlock_p;
position lock_p;
@@

mutex_unlock@unlock_p(E)
...
kthread_stop@stop_p(...)
...
mutex_lock@lock_p(E)

@script:python@
stop_p << race.stop_p;
unlock_p << race.unlock_p;
lock_p << race.lock_p;
E << race.E;
@@

coccilib.report.print_report(unlock_p[0], 'mutex_unlock(' + E + ') here')
coccilib.report.print_report(stop_p[0], 'kthread_stop here')
coccilib.report.print_report(lock_p[0], 'mutex_lock(' + E + ') here\n')


Testing the rule
================

I reverted the commit 6dcd5d7a7a29c1e4 and called:
COCCI=./scripts/coccinelle/kthread_race.cocci make coccicheck MODE=report

The result:

./drivers/media/platform/vivid/vivid-kthread-out.c:347:1-13: mutex_unlock(& dev
-> mutex) here
./drivers/media/platform/vivid/vivid-kthread-out.c:348:1-13: kthread_stop here
./drivers/media/platform/vivid/vivid-kthread-out.c:350:1-11: mutex_lock(& dev ->
mutex) here

./drivers/media/platform/vivid/vivid-sdr-cap.c:306:1-13: mutex_unlock(& dev ->
mutex) here
./drivers/media/platform/vivid/vivid-sdr-cap.c:307:1-13: kthread_stop here
./drivers/media/platform/vivid/vivid-sdr-cap.c:309:1-11: mutex_lock(& dev ->
mutex) here

./drivers/media/platform/vivid/vivid-kthread-cap.c:1001:1-13: mutex_unlock(& dev
-> mutex) here
./drivers/media/platform/vivid/vivid-kthread-cap.c:1002:1-13: kthread_stop here
./drivers/media/platform/vivid/vivid-kthread-cap.c:1004:1-11: mutex_lock(& dev
-> mutex) here

No other bugs were detected.

Do you have any ideas on how to improve it?
Do we need this rule for regression testing upstream?

Thanks in advance!
Alexander

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Coccinelle rule for CVE-2019-18683
  2020-04-08 22:01 Coccinelle rule for CVE-2019-18683 Alexander Popov
@ 2020-04-08 22:26 ` Jann Horn
  2020-04-09 19:41   ` Alexander Popov
  2020-04-09 10:53 ` [Cocci] " Julia Lawall
  1 sibling, 1 reply; 8+ messages in thread
From: Jann Horn @ 2020-04-08 22:26 UTC (permalink / raw)
  To: Alexander Popov
  Cc: Julia Lawall, Gilles Muller, Nicolas Palix, Michal Marek, cocci,
	kernel-hardening, Kees Cook, Hans Verkuil, Mauro Carvalho Chehab,
	Linux Media Mailing List, LKML

On Thu, Apr 9, 2020 at 12:01 AM Alexander Popov <alex.popov@linux.com> wrote:
> CVE-2019-18683 refers to three similar vulnerabilities caused by the same
> incorrect approach to locking that is used in vivid_stop_generating_vid_cap(),
> vivid_stop_generating_vid_out(), and sdr_cap_stop_streaming().
>
> For fixes please see the commit 6dcd5d7a7a29c1e4 (media: vivid: Fix wrong
> locking that causes race conditions on streaming stop).
>
> These three functions are called during streaming stopping with vivid_dev.mutex
> locked. And they all do the same mistake while stopping their kthreads, which
> need to lock this mutex as well. See the example from
> vivid_stop_generating_vid_cap():
>     /* shutdown control thread */
>     vivid_grab_controls(dev, false);
>     mutex_unlock(&dev->mutex);
>     kthread_stop(dev->kthread_vid_cap);
>     dev->kthread_vid_cap = NULL;
>     mutex_lock(&dev->mutex);
>
> But when this mutex is unlocked, another vb2_fop_read() can lock it instead of
> the kthread and manipulate the buffer queue. That causes use-after-free.
>
> I created a Coccinelle rule that detects mutex_unlock+kthread_stop+mutex_lock
> within one function.
[...]
> mutex_unlock@unlock_p(E)
> ...
> kthread_stop@stop_p(...)
> ...
> mutex_lock@lock_p(E)

Is the kthread_stop() really special here? It seems to me like it's
pretty much just a normal instance of the "temporarily dropping a
lock" pattern - which does tend to go wrong quite often, but can also
be correct.

I think it would be interesting though to have a list of places that
drop and then re-acquire a mutex/spinlock/... that was not originally
acquired in the same block of code (but was instead originally
acquired in an outer block, or by a parent function, or something like
that). So things like this:

void X(...) {
  mutex_lock(A);
  for (...) {
    ...
    mutex_unlock(A);
    ...
    mutex_lock(A);
    ...
  }
  mutex_unlock(A);
}

or like this:

void X(...) {
  ... [no mutex operations on A]
  mutex_unlock(A);
  ...
  mutex_lock(A);
  ...
}


But of course, there are places where this kind of behavior is
correct; so such a script wouldn't just report buggy code, but code
that could use a bit more scrutiny than normal. For example, in
madvise_remove(), the mmap_sem is dropped and then re-acquired, which
is fine because the caller deals with that possibility properly:

static long madvise_remove(struct vm_area_struct *vma,
                                struct vm_area_struct **prev,
                                unsigned long start, unsigned long end)
{
        loff_t offset;
        int error;
        struct file *f;

        *prev = NULL;   /* tell sys_madvise we drop mmap_sem */

        if (vma->vm_flags & VM_LOCKED)
                return -EINVAL;

        f = vma->vm_file;

        if (!f || !f->f_mapping || !f->f_mapping->host) {
                        return -EINVAL;
        }

        if ((vma->vm_flags & (VM_SHARED|VM_WRITE)) != (VM_SHARED|VM_WRITE))
                return -EACCES;

        offset = (loff_t)(start - vma->vm_start)
                        + ((loff_t)vma->vm_pgoff << PAGE_SHIFT);

        /*
         * Filesystem's fallocate may need to take i_mutex.  We need to
         * explicitly grab a reference because the vma (and hence the
         * vma's reference to the file) can go away as soon as we drop
         * mmap_sem.
         */
        get_file(f);
        if (userfaultfd_remove(vma, start, end)) {
                /* mmap_sem was not released by userfaultfd_remove() */
                up_read(&current->mm->mmap_sem);
        }
        error = vfs_fallocate(f,
                                FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                                offset, end - start);
        fput(f);
        down_read(&current->mm->mmap_sem);
        return error;
}


* Re: [Cocci] Coccinelle rule for CVE-2019-18683
  2020-04-08 22:01 Coccinelle rule for CVE-2019-18683 Alexander Popov
  2020-04-08 22:26 ` Jann Horn
@ 2020-04-09 10:53 ` Julia Lawall
  2020-04-09 19:56   ` Alexander Popov
  1 sibling, 1 reply; 8+ messages in thread
From: Julia Lawall @ 2020-04-09 10:53 UTC (permalink / raw)
  To: Alexander Popov
  Cc: Gilles Muller, Nicolas Palix, Michal Marek, cocci,
	kernel-hardening, Jann Horn, Kees Cook, Hans Verkuil,
	Mauro Carvalho Chehab, Linux Media Mailing List, LKML, jannh



On Thu, 9 Apr 2020, Alexander Popov wrote:

> Hello!
>
> Some time ago I fixed CVE-2019-18683 in the V4L2 subsystem of the Linux kernel.
>
> I created a Coccinelle rule that detects that bug pattern. Let me show it.

Thanks for the discussion :)

>
>
> Bug pattern
> ===========
>
> CVE-2019-18683 refers to three similar vulnerabilities caused by the same
> incorrect approach to locking that is used in vivid_stop_generating_vid_cap(),
> vivid_stop_generating_vid_out(), and sdr_cap_stop_streaming().
>
> For fixes please see the commit 6dcd5d7a7a29c1e4 (media: vivid: Fix wrong
> locking that causes race conditions on streaming stop).
>
> These three functions are called during streaming stopping with vivid_dev.mutex
> locked. And they all do the same mistake while stopping their kthreads, which
> need to lock this mutex as well. See the example from
> vivid_stop_generating_vid_cap():
>     /* shutdown control thread */
>     vivid_grab_controls(dev, false);
>     mutex_unlock(&dev->mutex);
>     kthread_stop(dev->kthread_vid_cap);
>     dev->kthread_vid_cap = NULL;
>     mutex_lock(&dev->mutex);
>
> But when this mutex is unlocked, another vb2_fop_read() can lock it instead of
> the kthread and manipulate the buffer queue. That causes use-after-free.
>
> I created a Coccinelle rule that detects mutex_unlock+kthread_stop+mutex_lock
> within one function.
>
>
> Coccinelle rule
> ===============
>
> virtual report
>
> @race exists@
> expression E;
> position stop_p;
> position unlock_p;
> position lock_p;
> @@
>
> mutex_unlock@unlock_p(E)
> ...

It would be good to put when != mutex_lock(E) after the ... above.  Your
rule doesn't actually prevent the lock from being retaken.

> kthread_stop@stop_p(...)
> ...
> mutex_lock@lock_p(E)
>
> @script:python@
> stop_p << race.stop_p;
> unlock_p << race.unlock_p;
> lock_p << race.lock_p;
> E << race.E;
> @@
>
> coccilib.report.print_report(unlock_p[0], 'mutex_unlock(' + E + ') here')
> coccilib.report.print_report(stop_p[0], 'kthread_stop here')
> coccilib.report.print_report(lock_p[0], 'mutex_lock(' + E + ') here\n')
>
>
> Testing the rule
> ================
>
> I reverted the commit 6dcd5d7a7a29c1e4 and called:
> COCCI=./scripts/coccinelle/kthread_race.cocci make coccicheck MODE=report
>
> The result:
>
> ./drivers/media/platform/vivid/vivid-kthread-out.c:347:1-13: mutex_unlock(& dev
> -> mutex) here
> ./drivers/media/platform/vivid/vivid-kthread-out.c:348:1-13: kthread_stop here
> ./drivers/media/platform/vivid/vivid-kthread-out.c:350:1-11: mutex_lock(& dev ->
> mutex) here
>
> ./drivers/media/platform/vivid/vivid-sdr-cap.c:306:1-13: mutex_unlock(& dev ->
> mutex) here
> ./drivers/media/platform/vivid/vivid-sdr-cap.c:307:1-13: kthread_stop here
> ./drivers/media/platform/vivid/vivid-sdr-cap.c:309:1-11: mutex_lock(& dev ->
> mutex) here
>
> ./drivers/media/platform/vivid/vivid-kthread-cap.c:1001:1-13: mutex_unlock(& dev
> -> mutex) here
> ./drivers/media/platform/vivid/vivid-kthread-cap.c:1002:1-13: kthread_stop here
> ./drivers/media/platform/vivid/vivid-kthread-cap.c:1004:1-11: mutex_lock(& dev
> -> mutex) here
>
> There are no other bugs detected.
>
> Do you have any idea how to improve it?
> Do we need that rule for regression testing in the upstream?

Based on Jann's suggestion, it seems like it could be interesting to find
these locking pauses, and then collect the functions that are used under locks
and in lock pauses.  If a function is mostly used with locks held, then
using it in a lock pause could be a sign of a bug.  I will see if it turns
up anything interesting.

julia


* Re: Coccinelle rule for CVE-2019-18683
  2020-04-08 22:26 ` Jann Horn
@ 2020-04-09 19:41   ` Alexander Popov
  2020-04-11  0:10     ` Alexander Popov
  0 siblings, 1 reply; 8+ messages in thread
From: Alexander Popov @ 2020-04-09 19:41 UTC (permalink / raw)
  To: Jann Horn
  Cc: Julia Lawall, Gilles Muller, Nicolas Palix, Michal Marek, cocci,
	kernel-hardening, Kees Cook, Hans Verkuil, Mauro Carvalho Chehab,
	Linux Media Mailing List, LKML, Markus Elfring

Jann, thanks for your reply!

On 09.04.2020 01:26, Jann Horn wrote:
> On Thu, Apr 9, 2020 at 12:01 AM Alexander Popov <alex.popov@linux.com> wrote:
>> CVE-2019-18683 refers to three similar vulnerabilities caused by the same
>> incorrect approach to locking that is used in vivid_stop_generating_vid_cap(),
>> vivid_stop_generating_vid_out(), and sdr_cap_stop_streaming().
>>
>> For fixes please see the commit 6dcd5d7a7a29c1e4 (media: vivid: Fix wrong
>> locking that causes race conditions on streaming stop).
>>
>> These three functions are called during streaming stopping with vivid_dev.mutex
>> locked. And they all do the same mistake while stopping their kthreads, which
>> need to lock this mutex as well. See the example from
>> vivid_stop_generating_vid_cap():
>>     /* shutdown control thread */
>>     vivid_grab_controls(dev, false);
>>     mutex_unlock(&dev->mutex);
>>     kthread_stop(dev->kthread_vid_cap);
>>     dev->kthread_vid_cap = NULL;
>>     mutex_lock(&dev->mutex);
>>
>> But when this mutex is unlocked, another vb2_fop_read() can lock it instead of
>> the kthread and manipulate the buffer queue. That causes use-after-free.
>>
>> I created a Coccinelle rule that detects mutex_unlock+kthread_stop+mutex_lock
>> within one function.
> [...]
>> mutex_unlock@unlock_p(E)
>> ...
>> kthread_stop@stop_p(...)
>> ...
>> mutex_lock@lock_p(E)
> 
> Is the kthread_stop() really special here? It seems to me like it's
> pretty much just a normal instance of the "temporarily dropping a
> lock" pattern - which does tend to go wrong quite often, but can also
> be correct.

Right, searching without kthread_stop() gives more cases.

> I think it would be interesting though to have a list of places that
> drop and then re-acquire a mutex/spinlock/... that was not originally
> acquired in the same block of code (but was instead originally
> acquired in an outer block, or by a parent function, or something like
> that). So things like this:

It's a very good idea. I tried it and got first results (described below).

> void X(...) {
>   mutex_lock(A);
>   for (...) {
>     ...
>     mutex_unlock(A);
>     ...
>     mutex_lock(A);
>     ...
>   }
>   mutex_unlock(A);
> }

I'm not an expert in SmPL yet, so I don't know how to describe this case.

> or like this:
> 
> void X(...) {
>   ... [no mutex operations on A]
>   mutex_unlock(A);
>   ...
>   mutex_lock(A);
>   ...
> }

Yes, I adapted the rule for that easier case:

```
virtual report
virtual context

@race exists@
expression E;
position unlock_p;
position lock_p;
@@

... when != mutex_lock(E)
* mutex_unlock@unlock_p(E)
...
* mutex_lock@lock_p(E)

@script:python@
unlock_p << race.unlock_p;
lock_p << race.lock_p;
E << race.E;
@@

coccilib.report.print_report(unlock_p[0], 'see mutex_unlock(' + E + ') here')
coccilib.report.print_report(lock_p[0], 'see mutex_lock(' + E + ') here\n')
```

The command to run it:
  COCCI=./scripts/coccinelle/kthread_race.cocci make coccicheck MODE=context
It shows the surrounding code context in the form of a diff.

This rule found 195 matches. Not that much!

> But of course, there are places where this kind of behavior is
> correct; so such a script wouldn't just return report code, just code
> that could use a bit more scrutiny than normal. 

I've spent some time looking through the results.
Currently I see 3 types of cases.


1. Cases that look legit: a mutex is unlocked while waiting or sleeping.

Example:
./fs/io_uring.c:7908:2-14: see mutex_unlock(& ctx -> uring_lock) here
./fs/io_uring.c:7910:2-12: see mutex_lock(& ctx -> uring_lock) here

diff -u -p ./fs/io_uring.c /tmp/nothing/fs/io_uring.c
--- ./fs/io_uring.c
+++ /tmp/nothing/fs/io_uring.c
@@ -7905,9 +7905,7 @@ static int __io_uring_register(struct io
 		 * to drop the mutex here, since no new references will come in
 		 * after we've killed the percpu ref.
 		 */
-		mutex_unlock(&ctx->uring_lock);
 		ret = wait_for_completion_interruptible(&ctx->completions[0]);
-		mutex_lock(&ctx->uring_lock);
 		if (ret) {
 			percpu_ref_resurrect(&ctx->refs);
 			ret = -EINTR;


Another example that looks legit:
./mm/ksm.c:2709:2-14: see mutex_unlock(& ksm_thread_mutex) here
./mm/ksm.c:2712:2-12: see mutex_lock(& ksm_thread_mutex) here

diff -u -p ./mm/ksm.c /tmp/nothing/mm/ksm.c
--- ./mm/ksm.c
+++ /tmp/nothing/mm/ksm.c
@@ -2706,10 +2706,8 @@ void ksm_migrate_page(struct page *newpa
 static void wait_while_offlining(void)
 {
 	while (ksm_run & KSM_RUN_OFFLINE) {
-		mutex_unlock(&ksm_thread_mutex);
 		wait_on_bit(&ksm_run, ilog2(KSM_RUN_OFFLINE),
 			    TASK_UNINTERRUPTIBLE);
-		mutex_lock(&ksm_thread_mutex);
 	}
 }


2. Weird cases that look like just avoiding a deadlock.

Example. This mutex is unlocked for a while by an interrupt handler:
./sound/pci/pcxhr/pcxhr_core.c:1210:3-15: see mutex_unlock(& mgr -> lock) here
./sound/pci/pcxhr/pcxhr_core.c:1212:3-13: see mutex_lock(& mgr -> lock) here

diff -u -p ./sound/pci/pcxhr/pcxhr_core.c /tmp/nothing/sound/pci/pcxhr/pcxhr_core.c
--- ./sound/pci/pcxhr/pcxhr_core.c
+++ /tmp/nothing/sound/pci/pcxhr/pcxhr_core.c
@@ -1207,9 +1207,7 @@ static void pcxhr_update_timer_pos(struc
 		}

 		if (elapsed) {
-			mutex_unlock(&mgr->lock);
 			snd_pcm_period_elapsed(stream->substream);
-			mutex_lock(&mgr->lock);
 		}
 	}
 }

Another weird example. Looks a bit similar to V4L2 bugs.

./drivers/net/wireless/broadcom/b43/main.c:4334:1-13: see mutex_unlock(& wl ->
mutex) here
./drivers/net/wireless/broadcom/b43/main.c:4338:1-11: see mutex_lock(& wl ->
mutex) here

diff -u -p ./drivers/net/wireless/broadcom/b43/main.c
/tmp/nothing/drivers/net/wireless/broadcom/b43/main.c
--- ./drivers/net/wireless/broadcom/b43/main.c
+++ /tmp/nothing/drivers/net/wireless/broadcom/b43/main.c
@@ -4331,11 +4331,9 @@ redo:
 		return dev;

 	/* Cancel work. Unlock to avoid deadlocks. */
-	mutex_unlock(&wl->mutex);
 	cancel_delayed_work_sync(&dev->periodic_work);
 	cancel_work_sync(&wl->tx_work);
 	b43_leds_stop(dev);
-	mutex_lock(&wl->mutex);
 	dev = wl->current_dev;
 	if (!dev || b43_status(dev) < B43_STAT_STARTED) {
 		/* Whoops, aliens ate up the device while we were unlocked. */


3. False positive cases.
The pointer to the mutex changes between the unlock and the lock.

Example:
./fs/ceph/caps.c:2103:4-16: see mutex_unlock(& session -> s_mutex) here
./fs/ceph/caps.c:2105:3-13: see mutex_lock(& session -> s_mutex) here

@@ -2100,9 +2094,7 @@ retry_locked:
 		if (session != cap->session) {
 			spin_unlock(&ci->i_ceph_lock);
 			if (session)
-				mutex_unlock(&session->s_mutex);
 			session = cap->session;
-			mutex_lock(&session->s_mutex);
 			goto retry;
 		}
 		if (cap->session->s_state < CEPH_MDS_SESSION_OPEN) {


I would be grateful for your ideas and feedback.
Alexander


* Re: [Cocci] Coccinelle rule for CVE-2019-18683
  2020-04-09 10:53 ` [Cocci] " Julia Lawall
@ 2020-04-09 19:56   ` Alexander Popov
  2020-04-09 20:45     ` Julia Lawall
  0 siblings, 1 reply; 8+ messages in thread
From: Alexander Popov @ 2020-04-09 19:56 UTC (permalink / raw)
  To: Julia Lawall
  Cc: Gilles Muller, Nicolas Palix, Michal Marek, cocci,
	kernel-hardening, Jann Horn, Kees Cook, Hans Verkuil,
	Mauro Carvalho Chehab, Linux Media Mailing List, LKML,
	Markus Elfring

On 09.04.2020 13:53, Julia Lawall wrote:
> On Thu, 9 Apr 2020, Alexander Popov wrote:
>> virtual report
>>
>> @race exists@
>> expression E;
>> position stop_p;
>> position unlock_p;
>> position lock_p;
>> @@
>>
>> mutex_unlock@unlock_p(E)
>> ...
> 
> It would be good to put when != mutex_lock(E) after the ... above.  Your
> rule doesn't actually prevent the lock from being retaken.

Thanks Julia! I used this trick in the second version of the rule that I've just
sent.

>> kthread_stop@stop_p(...)
>> ...
>> mutex_lock@lock_p(E)
>>
>> @script:python@
>> stop_p << race.stop_p;
>> unlock_p << race.unlock_p;
>> lock_p << race.lock_p;
>> E << race.E;
>> @@
>>
>> coccilib.report.print_report(unlock_p[0], 'mutex_unlock(' + E + ') here')
>> coccilib.report.print_report(stop_p[0], 'kthread_stop here')
>> coccilib.report.print_report(lock_p[0], 'mutex_lock(' + E + ') here\n')

...

> Based on Jann's suggestion, it seem like it could be interesting to find
> these locking pauses, and then collect functions that are used in locks
> and in lock pauses.  If a function is mostly used with locks held, then
> using it in a lock pause could be a sign of a bug.  I will see if it turns
> up anything interesting.

Do you mean collecting the behaviour that happens between unlocking and locking
and then analysing it somehow?

Best regards,
Alexander


* Re: [Cocci] Coccinelle rule for CVE-2019-18683
  2020-04-09 19:56   ` Alexander Popov
@ 2020-04-09 20:45     ` Julia Lawall
  0 siblings, 0 replies; 8+ messages in thread
From: Julia Lawall @ 2020-04-09 20:45 UTC (permalink / raw)
  To: Alexander Popov
  Cc: Gilles Muller, Nicolas Palix, Michal Marek, cocci,
	kernel-hardening, Jann Horn, Kees Cook, Hans Verkuil,
	Mauro Carvalho Chehab, Linux Media Mailing List, LKML,
	Markus Elfring

> >> kthread_stop@stop_p(...)
> >> ...
> >> mutex_lock@lock_p(E)
> >>
> >> @script:python@
> >> stop_p << race.stop_p;
> >> unlock_p << race.unlock_p;
> >> lock_p << race.lock_p;
> >> E << race.E;
> >> @@
> >>
> >> coccilib.report.print_report(unlock_p[0], 'mutex_unlock(' + E + ') here')
> >> coccilib.report.print_report(stop_p[0], 'kthread_stop here')
> >> coccilib.report.print_report(lock_p[0], 'mutex_lock(' + E + ') here\n')
>
> ...
>
> > Based on Jann's suggestion, it seem like it could be interesting to find
> > these locking pauses, and then collect functions that are used in locks
> > and in lock pauses.  If a function is mostly used with locks held, then
> > using it in a lock pause could be a sign of a bug.  I will see if it turns
> > up anything interesting.
>
> Do you mean collecting the behaviour that happens between unlocking and locking
> and then analysing it somehow?

Yes.  I have tried doing what I described, but I'm not sure that the
results are very reliable at the moment.

julia


* Re: Coccinelle rule for CVE-2019-18683
  2020-04-09 19:41   ` Alexander Popov
@ 2020-04-11  0:10     ` Alexander Popov
  2020-04-11  5:07       ` Markus Elfring
  0 siblings, 1 reply; 8+ messages in thread
From: Alexander Popov @ 2020-04-11  0:10 UTC (permalink / raw)
  To: Jann Horn
  Cc: Julia Lawall, Gilles Muller, Nicolas Palix, Michal Marek, cocci,
	kernel-hardening, Kees Cook, Hans Verkuil, Mauro Carvalho Chehab,
	Linux Media Mailing List, LKML, Markus Elfring

On 09.04.2020 22:41, Alexander Popov wrote:
> On 09.04.2020 01:26, Jann Horn wrote:
>> On Thu, Apr 9, 2020 at 12:01 AM Alexander Popov <alex.popov@linux.com> wrote:
>>> CVE-2019-18683 refers to three similar vulnerabilities caused by the same
>>> incorrect approach to locking that is used in vivid_stop_generating_vid_cap(),
>>> vivid_stop_generating_vid_out(), and sdr_cap_stop_streaming().
>>>
>>> For fixes please see the commit 6dcd5d7a7a29c1e4 (media: vivid: Fix wrong
>>> locking that causes race conditions on streaming stop).
>>>
>>> These three functions are called during streaming stopping with vivid_dev.mutex
>>> locked. And they all do the same mistake while stopping their kthreads, which
>>> need to lock this mutex as well. See the example from
>>> vivid_stop_generating_vid_cap():
>>>     /* shutdown control thread */
>>>     vivid_grab_controls(dev, false);
>>>     mutex_unlock(&dev->mutex);
>>>     kthread_stop(dev->kthread_vid_cap);
>>>     dev->kthread_vid_cap = NULL;
>>>     mutex_lock(&dev->mutex);
>>>
>>> But when this mutex is unlocked, another vb2_fop_read() can lock it instead of
>>> the kthread and manipulate the buffer queue. That causes use-after-free.
>>>
>>> I created a Coccinelle rule that detects mutex_unlock+kthread_stop+mutex_lock
>>> within one function.
>> [...]
>>> mutex_unlock@unlock_p(E)
>>> ...
>>> kthread_stop@stop_p(...)
>>> ...
>>> mutex_lock@lock_p(E)
>>
>> Is the kthread_stop() really special here? It seems to me like it's
>> pretty much just a normal instance of the "temporarily dropping a
>> lock" pattern - which does tend to go wrong quite often, but can also
>> be correct.
> 
> Right, searching without kthread_stop() gives more cases.
> 
>> I think it would be interesting though to have a list of places that
>> drop and then re-acquire a mutex/spinlock/... that was not originally
>> acquired in the same block of code (but was instead originally
>> acquired in an outer block, or by a parent function, or something like
>> that). So things like this:

The following rule reported 146 matching cases, which might be interesting.

```
virtual report
virtual context

@race exists@
expression E;
position unlock_p;
position lock_p;
@@

... when != mutex_lock(E)
* mutex_unlock@unlock_p(E)
... when != schedule()
    when != schedule_timeout(...)
    when != cond_resched()
    when != wait_event(...)
    when != wait_event_timeout(...)
    when != wait_event_interruptible_timeout(...)
    when != wait_event_interruptible(...)
    when != msleep(...)
    when != msleep_interruptible(...)
* mutex_lock@lock_p(E)

@script:python@
unlock_p << race.unlock_p;
lock_p << race.lock_p;
E << race.E;
@@

coccilib.report.print_report(unlock_p[0], 'see mutex_unlock(' + E + ') here')
coccilib.report.print_report(lock_p[0], 'see mutex_lock(' + E + ') here\n')
```

Analysing each matching case would take a lot of time.

However, I'm focused on searching for kernel security issues.
So I will filter out code that:
 - is not enabled in popular kernel configurations,
 - doesn't create additional attack surface.
Then I'll take the time to analyse the rest of reported cases.

I'll inform you if I find any bug.

Best regards,
Alexander


* Re: Coccinelle rule for CVE-2019-18683
  2020-04-11  0:10     ` Alexander Popov
@ 2020-04-11  5:07       ` Markus Elfring
  0 siblings, 0 replies; 8+ messages in thread
From: Markus Elfring @ 2020-04-11  5:07 UTC (permalink / raw)
  To: Alexander Popov, cocci, kernel-hardening
  Cc: Jann Horn, Julia Lawall, Gilles Muller, Nicolas Palix,
	Michal Marek, Kees Cook, Hans Verkuil, Mauro Carvalho Chehab,
	linux-media, LKML

> Analysing each matching case would take a lot of time.

How much effort would you like to invest in improving this situation?

Will any further development possibilities be picked up to considerably reduce
the number of false positives reported by this source code analysis approach?

Regards,
Markus

