linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dinakar Guniguntala <dino@in.ibm.com>
To: david singleton <dsingleton@mvista.com>
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>
Subject: Re: Recursion bug in -rt
Date: Fri, 16 Dec 2005 01:14:34 +0530	[thread overview]
Message-ID: <20051215194434.GA4741@in.ibm.com> (raw)
In-Reply-To: <9175126B-6D06-11DA-AA1B-000A959BB91E@mvista.com>

On Wed, Dec 14, 2005 at 05:03:13PM -0800, david singleton wrote:
> Dinakar,
> you may be correct, since reverting the change in down_futex seems 
> to work.

Well it did not. It ran for much longer than previously but still 
hung up. 

Well we have a very basic problem in the current implementation. 

Currently if a thread calls pthread_mutex_lock on the same lock 
(normal, non recursive lock) twice without unlocking in between, the 
application hangs. Which is the right behaviour.
However if the same thing is done with a non recursive robust mutex,
it manages to acquire the lock.

I see many problems here (I am assuming that the right behaviour
with robust mutexes is for application to ultimately block
indefinitely in the kernel)

1. In down_futex we do the following check

	if (owner_task != current)
		down_try_futex(lock, owner_task->thread_info __EIP__);

   In the above scenario, the thread would have acquired the uncontended
   lock first time around in userspace. The second time it tries to
   acquire the same mutex, because of the above check, does not
   call down_try_futex and hence will not initialize the lock structure
   in the kernel. It then goes to __down_interruptible where it is 
   granted the lock, which is wrong.

   So IMO the above check is not right. However removing this check
   is not the end of story.  This time it gets to task_blocks_on_lock 
   and tries to grab the task->pi_lock of the owvner which is itself
   and results in a system hang. (Assuming CONFIG_DEBUG_DEADLOCKS
   is not set). So it looks like we need to add some check to 
   prevent this below in case lock_owner happens to be current.

    _raw_spin_lock(&lock_owner(lock)->task->pi_lock);


> However, I'm  wondering if you've hit a bug that Dave Carlson is 
> reporting that he's tracked down to an inline in the glibc patch.

Yes I noticed that, basically it looks like INLINE_SYSCALL, on error
returns -1 with the error code in errno. Whereas we expect the
return code to be the same as the kernel return code. Are you referring
to this or something else ??

However even with all of the above fixes (remove the check in down_futex
as mentioned above, add a check in task_blocks_on_lock and the glibc
changes) my test program continues to hang up the system, though it 
takes a lot longer to recreate the problem now

[snip]

> 1) Why did the library call into the kernel if the calling thread
> owned the lock?

This is something I still havent figured out and leads me to believe
that we still have a tiny race race somewhere

	-Dinakar

  reply	other threads:[~2005-12-15 19:27 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-12-14 22:39 Recursion bug in -rt Dinakar Guniguntala
2005-12-15  1:03 ` david singleton
2005-12-15 19:44   ` Dinakar Guniguntala [this message]
2005-12-15 20:40     ` David Singleton
2005-12-16  0:02     ` david singleton
2005-12-16 18:42       ` Dinakar Guniguntala
2005-12-16 21:26         ` David Singleton
2005-12-19 11:56           ` Dinakar Guniguntala
2005-12-19 20:11             ` David Singleton
2005-12-15 19:00 ` David Singleton
2005-12-15 19:52   ` Dinakar Guniguntala
2005-12-20 13:19   ` Ingo Molnar
2005-12-20 15:50     ` Dinakar Guniguntala
2005-12-20 17:43       ` Esben Nielsen
2005-12-20 19:33         ` Steven Rostedt
2005-12-20 20:42           ` Esben Nielsen
2005-12-20 21:20             ` Steven Rostedt
2005-12-20 21:55               ` david singleton
2005-12-20 22:56                 ` Esben Nielsen
2005-12-20 23:12                   ` Steven Rostedt
2005-12-20 23:55                     ` Esben Nielsen
2005-12-22  4:37                       ` david singleton
2005-12-20 22:43               ` Esben Nielsen
2005-12-20 22:59                 ` Steven Rostedt
2006-01-03  1:54       ` david singleton
2006-01-05  2:14       ` david singleton
2006-01-05  9:43         ` Esben Nielsen
2006-01-05 17:11           ` david singleton
2006-01-05 17:47             ` Esben Nielsen
2006-01-05 18:26               ` david singleton
2006-01-07  2:40               ` robust futex deadlock detection patch david singleton
     [not found]                 ` <a36005b50601071145y7e2ead9an4a4ca7896f35a85e@mail.gmail.com>
2006-01-07 19:49                   ` Ulrich Drepper
2006-01-09  9:23                 ` Esben Nielsen
2006-01-09 20:01                   ` David Singleton
2006-01-09 20:16                     ` Esben Nielsen
2006-01-09 21:08                       ` Steven Rostedt
2006-01-09 21:19                         ` Esben Nielsen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20051215194434.GA4741@in.ibm.com \
    --to=dino@in.ibm.com \
    --cc=dsingleton@mvista.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).