All of lore.kernel.org
 help / color / mirror / Atom feed
* potential NULL dereference in futex_wait_requeue_pi()
@ 2012-07-18 14:25 Dan Carpenter
  2012-07-18 15:31 ` Dave Jones
  2012-07-18 16:03 ` Darren Hart
  0 siblings, 2 replies; 7+ messages in thread
From: Dan Carpenter @ 2012-07-18 14:25 UTC (permalink / raw)
  To: Darren Hart; +Cc: linux-kernel

Hi Darren,

The patch 52400ba94675: "futex: add requeue_pi functionality" from
Apr 3, 2009, leads to the following warning:
kernel/futex.c:2373 futex_wait_requeue_pi()
	 error: potential NULL dereference 'pi_mutex'.

  2330          if (!q.rt_waiter) {
  2331                  /*
  2332                   * Got the lock. We might not be the anticipated owner if we
  2333                   * did a lock-steal - fix up the PI-state in that case.
  2334                   */
  2335                  if (q.pi_state && (q.pi_state->owner != current)) {
  2336                          spin_lock(q.lock_ptr);
  2337                          ret = fixup_pi_state_owner(uaddr2, &q, current);

pi_mutex is NULL here and it looks like fixup_pi_state_owner() can
return -EFAULT.

  2338                          spin_unlock(q.lock_ptr);
  2339                  }
  2340          } else {

	[snipped]

  2366          }
  2367  
  2368          /*
  2369           * If fixup_pi_state_owner() faulted and was unable to handle the
  2370           * fault, unlock the rt_mutex and return the fault to userspace.
  2371           */
  2372          if (ret == -EFAULT) {
  2373                  if (rt_mutex_owner(pi_mutex) == current)
                                           ^^^^^^^^
This will oops if pi_mutex is NULL.

  2374                          rt_mutex_unlock(pi_mutex);
  2375          } else if (ret == -EINTR) {

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: potential NULL dereference in futex_wait_requeue_pi()
  2012-07-18 14:25 potential NULL dereference in futex_wait_requeue_pi() Dan Carpenter
@ 2012-07-18 15:31 ` Dave Jones
  2012-07-18 15:41   ` Darren Hart
  2012-07-18 16:03 ` Darren Hart
  1 sibling, 1 reply; 7+ messages in thread
From: Dave Jones @ 2012-07-18 15:31 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: Darren Hart, linux-kernel

On Wed, Jul 18, 2012 at 05:25:14PM +0300, Dan Carpenter wrote:
 > Hi Darren,
 > 
 > The patch 52400ba94675: "futex: add requeue_pi functionality" from
 > Apr 3, 2009, leads to the following warning:
 > kernel/futex.c:2373 futex_wait_requeue_pi()
 > 	 error: potential NULL dereference 'pi_mutex'.

This sounds like the oops I saw that I reported at https://lkml.org/lkml/2012/7/13/328

	Dave

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: potential NULL dereference in futex_wait_requeue_pi()
  2012-07-18 15:31 ` Dave Jones
@ 2012-07-18 15:41   ` Darren Hart
  2012-07-18 15:55     ` Dan Carpenter
  0 siblings, 1 reply; 7+ messages in thread
From: Darren Hart @ 2012-07-18 15:41 UTC (permalink / raw)
  To: Dave Jones, Dan Carpenter, linux-kernel



On 07/18/2012 08:31 AM, Dave Jones wrote:
> On Wed, Jul 18, 2012 at 05:25:14PM +0300, Dan Carpenter wrote:
>  > Hi Darren,
>  > 
>  > The patch 52400ba94675: "futex: add requeue_pi functionality" from
>  > Apr 3, 2009, leads to the following warning:
>  > kernel/futex.c:2373 futex_wait_requeue_pi()
>  > 	 error: potential NULL dereference 'pi_mutex'.
> 
> This sounds like the oops I saw that I reported at https://lkml.org/lkml/2012/7/13/328

I'm looking into the report Dave sent today (delayed from piles of
urgent issues after returning from vacation, my apologies).

Dan, under which workload are you seeing this?

-- 
Darren Hart
Intel Open Source Technology Center
Yocto Project - Linux Kernel



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: potential NULL dereference in futex_wait_requeue_pi()
  2012-07-18 15:41   ` Darren Hart
@ 2012-07-18 15:55     ` Dan Carpenter
  0 siblings, 0 replies; 7+ messages in thread
From: Dan Carpenter @ 2012-07-18 15:55 UTC (permalink / raw)
  To: Darren Hart; +Cc: Dave Jones, linux-kernel

On Wed, Jul 18, 2012 at 08:41:38AM -0700, Darren Hart wrote:
> 
> 
> On 07/18/2012 08:31 AM, Dave Jones wrote:
> > On Wed, Jul 18, 2012 at 05:25:14PM +0300, Dan Carpenter wrote:
> >  > Hi Darren,
> >  > 
> >  > The patch 52400ba94675: "futex: add requeue_pi functionality" from
> >  > Apr 3, 2009, leads to the following warning:
> >  > kernel/futex.c:2373 futex_wait_requeue_pi()
> >  > 	 error: potential NULL dereference 'pi_mutex'.
> > 
> > This sounds like the oops I saw that I reported at https://lkml.org/lkml/2012/7/13/328
> 
> I'm looking into the report Dave sent today (delayed from piles of
> urgent issues after returning from vacation, my apologies).
> 
> Dan, under which workload are you seeing this?
> 

This was a static checker warning.  It was Dave who saw the oops
from syscall fuzzing with trinity.

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: potential NULL dereference in futex_wait_requeue_pi()
  2012-07-18 14:25 potential NULL dereference in futex_wait_requeue_pi() Dan Carpenter
  2012-07-18 15:31 ` Dave Jones
@ 2012-07-18 16:03 ` Darren Hart
  2012-07-18 18:01   ` Dave Jones
  1 sibling, 1 reply; 7+ messages in thread
From: Darren Hart @ 2012-07-18 16:03 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: linux-kernel, Dave Jones, Thomas Gleixner

On 07/18/2012 07:25 AM, Dan Carpenter wrote:
> Hi Darren,
> 
> The patch 52400ba94675: "futex: add requeue_pi functionality" from
> Apr 3, 2009, leads to the following warning:
> kernel/futex.c:2373 futex_wait_requeue_pi()
> 	 error: potential NULL dereference 'pi_mutex'.
> 
>   2330          if (!q.rt_waiter) {
>   2331                  /*
>   2332                   * Got the lock. We might not be the anticipated owner if we
>   2333                   * did a lock-steal - fix up the PI-state in that case.
>   2334                   */
>   2335                  if (q.pi_state && (q.pi_state->owner != current)) {
>   2336                          spin_lock(q.lock_ptr);
>   2337                          ret = fixup_pi_state_owner(uaddr2, &q, current);
> 
> pi_mutex is NULL here and it looks like fixup_pi_state_owner() can
> return -EFAULT.
>
> 
>   2338                          spin_unlock(q.lock_ptr);
>   2339                  }
>   2340          } else {
> 
> 	[snipped]
> 
>   2366          }
>   2367  
>   2368          /*
>   2369           * If fixup_pi_state_owner() faulted and was unable to handle the
>   2370           * fault, unlock the rt_mutex and return the fault to userspace.
>   2371           */
>   2372          if (ret == -EFAULT) {
>   2373                  if (rt_mutex_owner(pi_mutex) == current)
>                                            ^^^^^^^^
> This will oops if pi_mutex is NULL.
> 
>   2374                          rt_mutex_unlock(pi_mutex);
>   2375          } else if (ret == -EINTR) {

Nice Dan, thanks for taking a closer look. This appears to be a simple fix, can
you try the following:


futex: Test for pi_mutex on fault in futex_wait_requeue_pi

If fixup_pi_state_owner() faults, pi_mutex may be NULL. Test
for pi_mutex != NULL before testing the owner against current
and possibly unlocking it.

Signed-off-by: Darren Hart <dvhart@linux.intel.com>
CC: Dave Jones <davej@redhat.com>
CC: Dan Carpenter <dan.carpenter@oracle.com>
CC: Thomas Gleixner <tglx@linutronix.de>

diff --git a/kernel/futex.c b/kernel/futex.c
index e2b0fb9..05018bf 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2370,7 +2370,7 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
 	 * fault, unlock the rt_mutex and return the fault to userspace.
 	 */
 	if (ret == -EFAULT) {
-		if (rt_mutex_owner(pi_mutex) == current)
+		if (pi_mutex && rt_mutex_owner(pi_mutex) == current)
 			rt_mutex_unlock(pi_mutex);
 	} else if (ret == -EINTR) {
 		/*


-- 
Darren Hart
Intel Open Source Technology Center
Yocto Project - Linux Kernel



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: potential NULL dereference in futex_wait_requeue_pi()
  2012-07-18 16:03 ` Darren Hart
@ 2012-07-18 18:01   ` Dave Jones
  2012-07-19  4:21     ` Darren Hart
  0 siblings, 1 reply; 7+ messages in thread
From: Dave Jones @ 2012-07-18 18:01 UTC (permalink / raw)
  To: Darren Hart; +Cc: Dan Carpenter, linux-kernel, Thomas Gleixner

On Wed, Jul 18, 2012 at 09:03:22AM -0700, Darren Hart wrote:
 
 > > This will oops if pi_mutex is NULL.
 > > 
 > >   2374                          rt_mutex_unlock(pi_mutex);
 > >   2375          } else if (ret == -EINTR) {
 > 
 > Nice Dan, thanks for taking a closer look. This appears to be a simple fix, can
 > you try the following:
 > 
 > 
 > futex: Test for pi_mutex on fault in futex_wait_requeue_pi
 > 
 > If fixup_pi_state_owner() faults, pi_mutex may be NULL. Test
 > for pi_mutex != NULL before testing the owner against current
 > and possibly unlocking it.
 > 
 > Signed-off-by: Darren Hart <dvhart@linux.intel.com>
 > CC: Dave Jones <davej@redhat.com>
 > CC: Dan Carpenter <dan.carpenter@oracle.com>
 > CC: Thomas Gleixner <tglx@linutronix.de>
 > 
 > diff --git a/kernel/futex.c b/kernel/futex.c
 > index e2b0fb9..05018bf 100644
 > --- a/kernel/futex.c
 > +++ b/kernel/futex.c
 > @@ -2370,7 +2370,7 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
 >  	 * fault, unlock the rt_mutex and return the fault to userspace.
 >  	 */
 >  	if (ret == -EFAULT) {
 > -		if (rt_mutex_owner(pi_mutex) == current)
 > +		if (pi_mutex && rt_mutex_owner(pi_mutex) == current)
 >  			rt_mutex_unlock(pi_mutex);
 >  	} else if (ret == -EINTR) {
 >  		/*

Doesn't fix the oops for me unfortunatly.  It looks like it happens further up,
so this might be a spearate bug after all.

I added this..

@@ -2344,7 +2351,13 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
                 * the pi_state.
                 */
                WARN_ON(!&q.pi_state);
+
                pi_mutex = &q.pi_state->pi_mutex;
+               if (pi_mutex == NULL) {
+                       ret = -EINVAL;
+                       goto out;
+               }
+
                ret = rt_mutex_finish_proxy_lock(pi_mutex, to, &rt_waiter, 1);


But that didn't seem to fix it either.  Somehow we still do this ..


BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
IP: [<ffffffff810d68be>] __lock_acquire+0x5e/0x1ae0

	lock_acquire+0xad/0x220
	_raw_spin_lock+0x46/0x80
	rt_mutex_finish_proxy_lock+0x34/0xe0
	futex_wait_requeue_pi.constprop.20+0x2e5/0x400
	do_futex+0xea/0xa20
	sys_futex+0x107/0x1a0
	system_call_fastpath+0x1a/0x1f

Ah, could it somehow be that we have a pi_mutex here, but it hasn't been initialised ?

The code: line fingers this as the failure in kernel/lockdep.c

        if (lock->key == &__lockdep_no_validate__)
    3f9e:       49 8b 07                mov    (%r15),%rax

r15 (lock) is somehow '0x28' here, which is why the NULL check I added didn't trigger.

This isn't helped by the fact that there seems to be another unrelated bug in futexes
that trinity triggers.  If you want to try this, running it with "-c futex" will reproduce
it very quickly.

	Dave


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: potential NULL dereference in futex_wait_requeue_pi()
  2012-07-18 18:01   ` Dave Jones
@ 2012-07-19  4:21     ` Darren Hart
  0 siblings, 0 replies; 7+ messages in thread
From: Darren Hart @ 2012-07-19  4:21 UTC (permalink / raw)
  To: Dave Jones, Dan Carpenter, linux-kernel, Thomas Gleixner



On 07/18/2012 11:01 AM, Dave Jones wrote:
> On Wed, Jul 18, 2012 at 09:03:22AM -0700, Darren Hart wrote:
>  
>  > > This will oops if pi_mutex is NULL.
>  > > 
>  > >   2374                          rt_mutex_unlock(pi_mutex);
>  > >   2375          } else if (ret == -EINTR) {
>  > 
>  > Nice Dan, thanks for taking a closer look. This appears to be a simple fix, can
>  > you try the following:
>  > 
>  > 
>  > futex: Test for pi_mutex on fault in futex_wait_requeue_pi
>  > 
>  > If fixup_pi_state_owner() faults, pi_mutex may be NULL. Test
>  > for pi_mutex != NULL before testing the owner against current
>  > and possibly unlocking it.
>  > 
>  > Signed-off-by: Darren Hart <dvhart@linux.intel.com>
>  > CC: Dave Jones <davej@redhat.com>
>  > CC: Dan Carpenter <dan.carpenter@oracle.com>
>  > CC: Thomas Gleixner <tglx@linutronix.de>
>  > 
>  > diff --git a/kernel/futex.c b/kernel/futex.c
>  > index e2b0fb9..05018bf 100644
>  > --- a/kernel/futex.c
>  > +++ b/kernel/futex.c
>  > @@ -2370,7 +2370,7 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
>  >  	 * fault, unlock the rt_mutex and return the fault to userspace.
>  >  	 */
>  >  	if (ret == -EFAULT) {
>  > -		if (rt_mutex_owner(pi_mutex) == current)
>  > +		if (pi_mutex && rt_mutex_owner(pi_mutex) == current)
>  >  			rt_mutex_unlock(pi_mutex);
>  >  	} else if (ret == -EINTR) {
>  >  		/*
> 
> Doesn't fix the oops for me unfortunatly.  It looks like it happens further up,
> so this might be a spearate bug after all.
> 
> I added this..
> 
> @@ -2344,7 +2351,13 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
>                  * the pi_state.
>                  */
>                 WARN_ON(!&q.pi_state);
> +
>                 pi_mutex = &q.pi_state->pi_mutex;
> +               if (pi_mutex == NULL) {
> +                       ret = -EINVAL;
> +                       goto out;
> +               }
> +
>                 ret = rt_mutex_finish_proxy_lock(pi_mutex, to, &rt_waiter, 1);
> 
> 
> But that didn't seem to fix it either.  Somehow we still do this ..
> 
> 
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000028

I was hunting for a lock->owner-> dereference in that path, as I think
owner can be at 0x28, but it doesn't work with your trace below.

> IP: [<ffffffff810d68be>] __lock_acquire+0x5e/0x1ae0
> 
> 	lock_acquire+0xad/0x220
> 	_raw_spin_lock+0x46/0x80
> 	rt_mutex_finish_proxy_lock+0x34/0xe0
> 	futex_wait_requeue_pi.constprop.20+0x2e5/0x400
> 	do_futex+0xea/0xa20
> 	sys_futex+0x107/0x1a0
> 	system_call_fastpath+0x1a/0x1f
> 
> Ah, could it somehow be that we have a pi_mutex here, but it hasn't been initialised ?
> 
> The code: line fingers this as the failure in kernel/lockdep.c
> 
>         if (lock->key == &__lockdep_no_validate__)
>     3f9e:       49 8b 07                mov    (%r15),%rax
> 
> r15 (lock) is somehow '0x28' here, which is why the NULL check I added didn't trigger.

OK, so for debug a < 0xc0000000 check might be useful.

> 
> This isn't helped by the fact that there seems to be another unrelated bug in futexes
> that trinity triggers.  If you want to try this, running it with "-c futex" will reproduce
> it very quickly.

Gave it a shot on the laptop (stock Fedora 17 kernel)... first it killed
my external display and reverted to the internal screen (nice trick
that) then it just locked up hard. Took about 2 seconds for the first, a
few more for the second. All unprivileged... nasty test case you have
here. I was considering adding fuzz testing to futextest, but I believe
you have it covered rather nicely here :-)

I'll start in on some instrumentation and further analysis after the
morning meetings (ugh).

-- 
Darren Hart
Intel Open Source Technology Center
Yocto Project - Linux Kernel



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2012-07-19  4:23 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-18 14:25 potential NULL dereference in futex_wait_requeue_pi() Dan Carpenter
2012-07-18 15:31 ` Dave Jones
2012-07-18 15:41   ` Darren Hart
2012-07-18 15:55     ` Dan Carpenter
2012-07-18 16:03 ` Darren Hart
2012-07-18 18:01   ` Dave Jones
2012-07-19  4:21     ` Darren Hart

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.