* potential NULL dereference in futex_wait_requeue_pi()
@ 2012-07-18 14:25 Dan Carpenter
2012-07-18 15:31 ` Dave Jones
2012-07-18 16:03 ` Darren Hart
0 siblings, 2 replies; 7+ messages in thread
From: Dan Carpenter @ 2012-07-18 14:25 UTC (permalink / raw)
To: Darren Hart; +Cc: linux-kernel
Hi Darren,
The patch 52400ba94675: "futex: add requeue_pi functionality" from
Apr 3, 2009, leads to the following warning:
kernel/futex.c:2373 futex_wait_requeue_pi()
error: potential NULL dereference 'pi_mutex'.
2330 if (!q.rt_waiter) {
2331 /*
2332 * Got the lock. We might not be the anticipated owner if we
2333 * did a lock-steal - fix up the PI-state in that case.
2334 */
2335 if (q.pi_state && (q.pi_state->owner != current)) {
2336 spin_lock(q.lock_ptr);
2337 ret = fixup_pi_state_owner(uaddr2, &q, current);
pi_mutex is NULL here and it looks like fixup_pi_state_owner() can
return -EFAULT.
2338 spin_unlock(q.lock_ptr);
2339 }
2340 } else {
[snipped]
2366 }
2367
2368 /*
2369 * If fixup_pi_state_owner() faulted and was unable to handle the
2370 * fault, unlock the rt_mutex and return the fault to userspace.
2371 */
2372 if (ret == -EFAULT) {
2373 if (rt_mutex_owner(pi_mutex) == current)
^^^^^^^^
This will oops if pi_mutex is NULL.
2374 rt_mutex_unlock(pi_mutex);
2375 } else if (ret == -EINTR) {
regards,
dan carpenter
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: potential NULL dereference in futex_wait_requeue_pi()
2012-07-18 14:25 potential NULL dereference in futex_wait_requeue_pi() Dan Carpenter
@ 2012-07-18 15:31 ` Dave Jones
2012-07-18 15:41 ` Darren Hart
2012-07-18 16:03 ` Darren Hart
1 sibling, 1 reply; 7+ messages in thread
From: Dave Jones @ 2012-07-18 15:31 UTC (permalink / raw)
To: Dan Carpenter; +Cc: Darren Hart, linux-kernel
On Wed, Jul 18, 2012 at 05:25:14PM +0300, Dan Carpenter wrote:
> Hi Darren,
>
> The patch 52400ba94675: "futex: add requeue_pi functionality" from
> Apr 3, 2009, leads to the following warning:
> kernel/futex.c:2373 futex_wait_requeue_pi()
> error: potential NULL dereference 'pi_mutex'.
This sounds like the oops I saw that I reported at https://lkml.org/lkml/2012/7/13/328
Dave
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: potential NULL dereference in futex_wait_requeue_pi()
2012-07-18 15:31 ` Dave Jones
@ 2012-07-18 15:41 ` Darren Hart
2012-07-18 15:55 ` Dan Carpenter
0 siblings, 1 reply; 7+ messages in thread
From: Darren Hart @ 2012-07-18 15:41 UTC (permalink / raw)
To: Dave Jones, Dan Carpenter, linux-kernel
On 07/18/2012 08:31 AM, Dave Jones wrote:
> On Wed, Jul 18, 2012 at 05:25:14PM +0300, Dan Carpenter wrote:
> > Hi Darren,
> >
> > The patch 52400ba94675: "futex: add requeue_pi functionality" from
> > Apr 3, 2009, leads to the following warning:
> > kernel/futex.c:2373 futex_wait_requeue_pi()
> > error: potential NULL dereference 'pi_mutex'.
>
> This sounds like the oops I saw that I reported at https://lkml.org/lkml/2012/7/13/328
I'm looking into the report Dave sent today (delayed from piles of
urgent issues after returning from vacation, my apologies).
Dan, under which workload are you seeing this?
--
Darren Hart
Intel Open Source Technology Center
Yocto Project - Linux Kernel
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: potential NULL dereference in futex_wait_requeue_pi()
2012-07-18 15:41 ` Darren Hart
@ 2012-07-18 15:55 ` Dan Carpenter
0 siblings, 0 replies; 7+ messages in thread
From: Dan Carpenter @ 2012-07-18 15:55 UTC (permalink / raw)
To: Darren Hart; +Cc: Dave Jones, linux-kernel
On Wed, Jul 18, 2012 at 08:41:38AM -0700, Darren Hart wrote:
>
>
> On 07/18/2012 08:31 AM, Dave Jones wrote:
> > On Wed, Jul 18, 2012 at 05:25:14PM +0300, Dan Carpenter wrote:
> > > Hi Darren,
> > >
> > > The patch 52400ba94675: "futex: add requeue_pi functionality" from
> > > Apr 3, 2009, leads to the following warning:
> > > kernel/futex.c:2373 futex_wait_requeue_pi()
> > > error: potential NULL dereference 'pi_mutex'.
> >
> > This sounds like the oops I saw that I reported at https://lkml.org/lkml/2012/7/13/328
>
> I'm looking into the report Dave sent today (delayed from piles of
> urgent issues after returning from vacation, my apologies).
>
> Dan, under which workload are you seeing this?
>
This was a static checker warning. It was Dave who saw the oops
from syscall fuzzing with trinity.
regards,
dan carpenter
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: potential NULL dereference in futex_wait_requeue_pi()
2012-07-18 14:25 potential NULL dereference in futex_wait_requeue_pi() Dan Carpenter
2012-07-18 15:31 ` Dave Jones
@ 2012-07-18 16:03 ` Darren Hart
2012-07-18 18:01 ` Dave Jones
1 sibling, 1 reply; 7+ messages in thread
From: Darren Hart @ 2012-07-18 16:03 UTC (permalink / raw)
To: Dan Carpenter; +Cc: linux-kernel, Dave Jones, Thomas Gleixner
On 07/18/2012 07:25 AM, Dan Carpenter wrote:
> Hi Darren,
>
> The patch 52400ba94675: "futex: add requeue_pi functionality" from
> Apr 3, 2009, leads to the following warning:
> kernel/futex.c:2373 futex_wait_requeue_pi()
> error: potential NULL dereference 'pi_mutex'.
>
> 2330 if (!q.rt_waiter) {
> 2331 /*
> 2332 * Got the lock. We might not be the anticipated owner if we
> 2333 * did a lock-steal - fix up the PI-state in that case.
> 2334 */
> 2335 if (q.pi_state && (q.pi_state->owner != current)) {
> 2336 spin_lock(q.lock_ptr);
> 2337 ret = fixup_pi_state_owner(uaddr2, &q, current);
>
> pi_mutex is NULL here and it looks like fixup_pi_state_owner() can
> return -EFAULT.
>
>
> 2338 spin_unlock(q.lock_ptr);
> 2339 }
> 2340 } else {
>
> [snipped]
>
> 2366 }
> 2367
> 2368 /*
> 2369 * If fixup_pi_state_owner() faulted and was unable to handle the
> 2370 * fault, unlock the rt_mutex and return the fault to userspace.
> 2371 */
> 2372 if (ret == -EFAULT) {
> 2373 if (rt_mutex_owner(pi_mutex) == current)
> ^^^^^^^^
> This will oops if pi_mutex is NULL.
>
> 2374 rt_mutex_unlock(pi_mutex);
> 2375 } else if (ret == -EINTR) {
Nice Dan, thanks for taking a closer look. This appears to be a simple fix, can
you try the following:
futex: Test for pi_mutex on fault in futex_wait_requeue_pi
If fixup_pi_state_owner() faults, pi_mutex may be NULL. Test
for pi_mutex != NULL before testing the owner against current
and possibly unlocking it.
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
CC: Dave Jones <davej@redhat.com>
CC: Dan Carpenter <dan.carpenter@oracle.com>
CC: Thomas Gleixner <tglx@linutronix.de>
diff --git a/kernel/futex.c b/kernel/futex.c
index e2b0fb9..05018bf 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2370,7 +2370,7 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
* fault, unlock the rt_mutex and return the fault to userspace.
*/
if (ret == -EFAULT) {
- if (rt_mutex_owner(pi_mutex) == current)
+ if (pi_mutex && rt_mutex_owner(pi_mutex) == current)
rt_mutex_unlock(pi_mutex);
} else if (ret == -EINTR) {
/*
--
Darren Hart
Intel Open Source Technology Center
Yocto Project - Linux Kernel
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: potential NULL dereference in futex_wait_requeue_pi()
2012-07-18 16:03 ` Darren Hart
@ 2012-07-18 18:01 ` Dave Jones
2012-07-19 4:21 ` Darren Hart
0 siblings, 1 reply; 7+ messages in thread
From: Dave Jones @ 2012-07-18 18:01 UTC (permalink / raw)
To: Darren Hart; +Cc: Dan Carpenter, linux-kernel, Thomas Gleixner
On Wed, Jul 18, 2012 at 09:03:22AM -0700, Darren Hart wrote:
> > This will oops if pi_mutex is NULL.
> >
> > 2374 rt_mutex_unlock(pi_mutex);
> > 2375 } else if (ret == -EINTR) {
>
> Nice Dan, thanks for taking a closer look. This appears to be a simple fix, can
> you try the following:
>
>
> futex: Test for pi_mutex on fault in futex_wait_requeue_pi
>
> If fixup_pi_state_owner() faults, pi_mutex may be NULL. Test
> for pi_mutex != NULL before testing the owner against current
> and possibly unlocking it.
>
> Signed-off-by: Darren Hart <dvhart@linux.intel.com>
> CC: Dave Jones <davej@redhat.com>
> CC: Dan Carpenter <dan.carpenter@oracle.com>
> CC: Thomas Gleixner <tglx@linutronix.de>
>
> diff --git a/kernel/futex.c b/kernel/futex.c
> index e2b0fb9..05018bf 100644
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -2370,7 +2370,7 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
> * fault, unlock the rt_mutex and return the fault to userspace.
> */
> if (ret == -EFAULT) {
> - if (rt_mutex_owner(pi_mutex) == current)
> + if (pi_mutex && rt_mutex_owner(pi_mutex) == current)
> rt_mutex_unlock(pi_mutex);
> } else if (ret == -EINTR) {
> /*
Doesn't fix the oops for me unfortunatly. It looks like it happens further up,
so this might be a spearate bug after all.
I added this..
@@ -2344,7 +2351,13 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
* the pi_state.
*/
WARN_ON(!&q.pi_state);
+
pi_mutex = &q.pi_state->pi_mutex;
+ if (pi_mutex == NULL) {
+ ret = -EINVAL;
+ goto out;
+ }
+
ret = rt_mutex_finish_proxy_lock(pi_mutex, to, &rt_waiter, 1);
But that didn't seem to fix it either. Somehow we still do this ..
BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
IP: [<ffffffff810d68be>] __lock_acquire+0x5e/0x1ae0
lock_acquire+0xad/0x220
_raw_spin_lock+0x46/0x80
rt_mutex_finish_proxy_lock+0x34/0xe0
futex_wait_requeue_pi.constprop.20+0x2e5/0x400
do_futex+0xea/0xa20
sys_futex+0x107/0x1a0
system_call_fastpath+0x1a/0x1f
Ah, could it somehow be that we have a pi_mutex here, but it hasn't been initialised ?
The code: line fingers this as the failure in kernel/lockdep.c
if (lock->key == &__lockdep_no_validate__)
3f9e: 49 8b 07 mov (%r15),%rax
r15 (lock) is somehow '0x28' here, which is why the NULL check I added didn't trigger.
This isn't helped by the fact that there seems to be another unrelated bug in futexes
that trinity triggers. If you want to try this, running it with "-c futex" will reproduce
it very quickly.
Dave
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: potential NULL dereference in futex_wait_requeue_pi()
2012-07-18 18:01 ` Dave Jones
@ 2012-07-19 4:21 ` Darren Hart
0 siblings, 0 replies; 7+ messages in thread
From: Darren Hart @ 2012-07-19 4:21 UTC (permalink / raw)
To: Dave Jones, Dan Carpenter, linux-kernel, Thomas Gleixner
On 07/18/2012 11:01 AM, Dave Jones wrote:
> On Wed, Jul 18, 2012 at 09:03:22AM -0700, Darren Hart wrote:
>
> > > This will oops if pi_mutex is NULL.
> > >
> > > 2374 rt_mutex_unlock(pi_mutex);
> > > 2375 } else if (ret == -EINTR) {
> >
> > Nice Dan, thanks for taking a closer look. This appears to be a simple fix, can
> > you try the following:
> >
> >
> > futex: Test for pi_mutex on fault in futex_wait_requeue_pi
> >
> > If fixup_pi_state_owner() faults, pi_mutex may be NULL. Test
> > for pi_mutex != NULL before testing the owner against current
> > and possibly unlocking it.
> >
> > Signed-off-by: Darren Hart <dvhart@linux.intel.com>
> > CC: Dave Jones <davej@redhat.com>
> > CC: Dan Carpenter <dan.carpenter@oracle.com>
> > CC: Thomas Gleixner <tglx@linutronix.de>
> >
> > diff --git a/kernel/futex.c b/kernel/futex.c
> > index e2b0fb9..05018bf 100644
> > --- a/kernel/futex.c
> > +++ b/kernel/futex.c
> > @@ -2370,7 +2370,7 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
> > * fault, unlock the rt_mutex and return the fault to userspace.
> > */
> > if (ret == -EFAULT) {
> > - if (rt_mutex_owner(pi_mutex) == current)
> > + if (pi_mutex && rt_mutex_owner(pi_mutex) == current)
> > rt_mutex_unlock(pi_mutex);
> > } else if (ret == -EINTR) {
> > /*
>
> Doesn't fix the oops for me unfortunatly. It looks like it happens further up,
> so this might be a spearate bug after all.
>
> I added this..
>
> @@ -2344,7 +2351,13 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
> * the pi_state.
> */
> WARN_ON(!&q.pi_state);
> +
> pi_mutex = &q.pi_state->pi_mutex;
> + if (pi_mutex == NULL) {
> + ret = -EINVAL;
> + goto out;
> + }
> +
> ret = rt_mutex_finish_proxy_lock(pi_mutex, to, &rt_waiter, 1);
>
>
> But that didn't seem to fix it either. Somehow we still do this ..
>
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
I was hunting for a lock->owner-> dereference in that path, as I think
owner can be at 0x28, but it doesn't work with your trace below.
> IP: [<ffffffff810d68be>] __lock_acquire+0x5e/0x1ae0
>
> lock_acquire+0xad/0x220
> _raw_spin_lock+0x46/0x80
> rt_mutex_finish_proxy_lock+0x34/0xe0
> futex_wait_requeue_pi.constprop.20+0x2e5/0x400
> do_futex+0xea/0xa20
> sys_futex+0x107/0x1a0
> system_call_fastpath+0x1a/0x1f
>
> Ah, could it somehow be that we have a pi_mutex here, but it hasn't been initialised ?
>
> The code: line fingers this as the failure in kernel/lockdep.c
>
> if (lock->key == &__lockdep_no_validate__)
> 3f9e: 49 8b 07 mov (%r15),%rax
>
> r15 (lock) is somehow '0x28' here, which is why the NULL check I added didn't trigger.
OK, so for debug a < 0xc0000000 check might be useful.
>
> This isn't helped by the fact that there seems to be another unrelated bug in futexes
> that trinity triggers. If you want to try this, running it with "-c futex" will reproduce
> it very quickly.
Gave it a shot on the laptop (stock Fedora 17 kernel)... first it killed
my external display and reverted to the internal screen (nice trick
that) then it just locked up hard. Took about 2 seconds for the first, a
few more for the second. All unprivileged... nasty test case you have
here. I was considering adding fuzz testing to futextest, but I believe
you have it covered rather nicely here :-)
I'll start in on some instrumentation and further analysis after the
morning meetings (ugh).
--
Darren Hart
Intel Open Source Technology Center
Yocto Project - Linux Kernel
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2012-07-19 4:23 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-18 14:25 potential NULL dereference in futex_wait_requeue_pi() Dan Carpenter
2012-07-18 15:31 ` Dave Jones
2012-07-18 15:41 ` Darren Hart
2012-07-18 15:55 ` Dan Carpenter
2012-07-18 16:03 ` Darren Hart
2012-07-18 18:01 ` Dave Jones
2012-07-19 4:21 ` Darren Hart
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.