linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Peter Wächtler" <pwaechtler@loewe-komp.de>
To: Rusty Russell <rusty@rustcorp.com.au>
Cc: Martin Wirth <martin.wirth@dlr.de>,
	linux-kernel@vger.kernel.org, Ulrich Drepper <drepper@redhat.com>
Subject: Re: [PATCH] Futexes IV (Fast Lightweight Userspace Semaphores)
Date: Sat, 16 Mar 2002 20:48:52 +0100	[thread overview]
Message-ID: <3C93A1A4.2FDCA723@loewe-komp.de> (raw)
In-Reply-To: <E16m1oK-0006oy-00@wagner.rustcorp.com.au>

Rusty Russell wrote:
> 
> In message <3C922005.50608@loewe-komp.de> you write:
> > Martin Wirth wrote:
> > > Rusty Russell wrote:
> > >
> > >>
> > >> Discussions with Ulrich have reaffirmed my opinion that pthreads are
> > >> crap.  Hence I'm not all that tempted to warp the (nice, clean,
> > >> usable) futex code too far to meet pthreads' wierd needs.
> > >>
> > > Crap or not, there are tons of software based on pthreads and at least
> > > the NGPT team says that Linus
> > > agreed to implement for necessary kernel-infrastructure for a full, fast
> > > pthread implementation.
> 
> Let me clarify my "pthreads are crap" statement.
> 
> I firmly believe that there is a place for clone & futex threaded
> programs, which is not met by pthreads for cleanliness and performance
> reasons, and that such programs will become increasingly important.
> 
> Therefore I refuse to penalise such progressive programs so we can
> have standards compliance.  Hence my insistance on a clean, minimal,
> useful interface.
> 
> > >> However, it's not too hard to implement condition variables using an
> > >> unavailable mutex, if we go for "full" semaphores: ie. not just
> > >> mutexes.  It requires a bit more of a stretch for kernel atomic ops...
> > >>
> > > A full semaphore is nice, but not a full replacement for a waitqueue (or
> > > a pthread condition variable brr..).
> > > For the semaphore you always have to assure that the ups and downs are
> > > balanced, what is not the case
> > > for the condition variable.
> > >
> >
> > also remember pthread_cond_broadcast - waking up _all_ waiting threads.
> > If the woken up threads check their condition and go to sleep again, is
> > up to them ( read: the standard mandates that _all_ get woken up)
> >
> > pthread_cond_signal notifies _one_ thread - which one depends on implementati
> on
> > ( I would like to see a priority based decision )
> 

I have to correct myself. This is the description of SUSV2 about
pthread_cond_signal/broadcast:

---snip---
These two functions are used to unblock threads blocked on a 
condition variable. 

The pthread_cond_signal() call unblocks at least one of the threads 
that are blocked on the specified condition variable cond (if any 
threads are blocked on cond). 

The pthread_cond_broadcast() call unblocks all threads currently 
blocked on the specified condition variable cond. 

If more than one thread is blocked on a condition variable, the 
scheduling policy determines the order in which threads are unblocked. 
When each thread unblocked as a result of a pthread_cond_signal() or 
pthread_cond_broadcast() returns from its call to pthread_cond_wait() 
or pthread_cond_timedwait(), the thread owns the mutex with which it 
called pthread_cond_wait() or pthread_cond_timedwait(). The thread(s) 
that are unblocked contend for the mutex according to the scheduling 
policy (if applicable), and as if each had called pthread_mutex_lock(). 

The pthread_cond_signal() or pthread_cond_broadcast() functions may 
be called by a thread whether or not it currently owns the mutex that 
threads calling pthread_cond_wait() or pthread_cond_timedwait() have 
associated with the condition variable during their waits; however, if 
predictable scheduling behaviour is required, then that mutex is locked 
by the thread calling pthread_cond_signal() or pthread_cond_broadcast(). 

The pthread_cond_signal() and pthread_cond_broadcast() functions have 
no effect if there are no threads currently blocked on cond.

RETURN VALUE

  If successful, the pthread_cond_signal() and pthread_cond_broadcast() 
  functions return zero. Otherwise, an error number is returned to
  indicate the error. 

ERRORS

  The pthread_cond_signal() and pthread_cond_broadcast() function 
  may fail if: 

 [EINVAL]
   The value cond does not refer to an initialised condition variable. 

   These functions will not return an error code of [EINTR]. 
--snip---

So the semantic of a condvar implies that when returning from a 
"succesful" wait [ i.e. not ETIMEDOUT] , the thread owns the mutex. 
Therefore the scheduler _should_ only wake up the highest priority 
waiting thread - it does not matter if we signal or broadcast!
It's the same operation in effect. Perhaps the broadcast is there 
for implementations that wouldn't queue up (or wake up) the waiters 
in priority order?

For this, I think, kernel support is best, since the waiters get 
woken up in priority order giving wake_one semantics.

For pthread_cond_timedwait() a kernel timer is necessary.
So making the signal/broadcast a syscall that does NOT lead to
a context switch would be benefical. At the next scheduling point
the kernel decides whom to wake up, also checking for timed waiters
to return with ETIMEDOUT.

Then there is the issue with a crashing process holding locks.
I think on IRIX the waiters get a trap causing them to die - what
else one could do?

[after writing and deleting some pseudo code]

I think the condvars are best implemented in shmem + kernel semaphores.
The only issue is a pthread_cond_timedwait - but a semaphore op IS
interruptible. Besides alarm(2)/setitimer(2) - what are other timeout 
mechanisms in Linux? 

In QNX Neutrino you have TimerTimeout() to arm a kernel timeout
for any blocking state (to avoid the window in alarm/timer_settime
and the blocking function call)

Without this you can hardly implement a reliable timeout (with
sub second resolution) for pthread_cond_timedwait or sigtimedwait.

So for PTHREAD_PROCESS_PRIVATE one could use futexes - on
PTHREAD_PROCESS_SHARED it has to reside in the kernel anyway and you
naturally have to live with context switches and a performance hit.

Now the problem with N:M threading model: here it's necessary to
prevent the kernel from blocking the whole process? 
Well, what is pthread_setconcurrency(int new_level) for?


And, what is so bad about condvars? How would you implement a 
typical consumer/producer problem?

To cite Stevens (Vol II, page 165): 
... "mutexes are for locking and cannot be used for waiting."
page 167: "A mutex is for locking and a condition variable is for 
waiting. ... and both are needed"

-- 
-----------------------------------------------------------------------
Peter Waechtler

  parent reply	other threads:[~2002-03-16 19:55 UTC|newest]

Thread overview: 84+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2002-03-13  9:12 [PATCH] Futexes IV (Fast Lightweight Userspace Semaphores) Martin Wirth
2002-03-13 19:41 ` Bill Davidsen
2002-03-13 19:52   ` Dave McCracken
2002-03-13 22:17     ` Bill Davidsen
2002-03-13 20:06   ` Alan Cox
2002-03-15  7:31 ` Rusty Russell
2002-03-15  8:41   ` Martin Wirth
2002-03-15 15:29     ` Hubertus Franke
2002-03-15 16:23     ` Peter Wächtler
2002-03-16  0:12       ` Rusty Russell
2002-03-16 11:23         ` Martin Wirth
2002-03-18  0:52           ` Ulrich Drepper
2002-03-19  3:28             ` Rusty Russell
2002-03-19  4:05               ` Ulrich Drepper
2002-03-20  6:20                 ` Rusty Russell
2002-03-20 10:42                   ` Peter Wächtler
2002-03-20 17:20                     ` Ulrich Drepper
2002-03-19  8:34               ` Martin Wirth
2002-03-20  6:45                 ` Rusty Russell
2002-03-21  6:48                   ` Martin Wirth
2002-03-24 18:25                     ` Peter Wächtler
2002-03-25  2:28                       ` Rusty Russell
2002-03-25  4:46                         ` Rusty Russell
2002-03-25 11:56                           ` Peter Wächtler
2002-03-26  1:02                             ` Rusty Russell
2002-03-26  8:17                               ` Martin Wirth
2002-03-26 23:10                                 ` Rusty Russell
2002-03-27 21:05                                   ` Hubertus Franke
2002-03-27 23:53                                     ` Rusty Russell
2002-03-25  9:47                         ` Peter Wächtler
2002-03-16 19:48         ` Peter Wächtler [this message]
2002-03-17  6:50         ` Rusty Russell
  -- strict thread matches above, loose matches on Subject: below --
2002-03-05  7:01 Rusty Russell
2002-03-05 22:39 ` Davide Libenzi
2002-03-05 23:16   ` Hubertus Franke
2002-03-05 23:26     ` Davide Libenzi
2002-03-05 23:37       ` Peter Svensson
2002-03-05 23:50         ` Davide Libenzi
2002-03-08  0:07       ` Richard Henderson
2002-03-06  1:46   ` Rusty Russell
2002-03-06  2:03     ` Davide Libenzi
2002-03-08 18:07 ` Linus Torvalds
2002-03-08 19:03   ` Hubertus Franke
2002-03-08 19:22     ` Linus Torvalds
2002-03-08 20:29       ` Hubertus Franke
2002-03-08 20:48         ` Matthew Kirkwood
2002-03-08 21:02         ` Linus Torvalds
2002-03-08 23:15           ` Hubertus Franke
2002-03-08 23:36             ` Alan Cox
2002-03-08 23:41             ` Linus Torvalds
2002-03-08 23:56               ` Hubertus Franke
2002-03-09  2:12                 ` Linus Torvalds
2002-03-11 14:14                   ` Hubertus Franke
2002-03-09  0:03               ` H. Peter Anvin
2002-03-09  1:15                 ` Alan Cox
2002-03-10 19:41                   ` Linus Torvalds
2002-03-11 20:49                     ` Pavel Machek
2002-03-10 19:58                   ` Martin J. Bligh
2002-03-10 20:40                     ` Alan Cox
2002-03-10 20:28                       ` Martin J. Bligh
2002-03-10 21:05                         ` Alan Cox
2002-03-13  7:40                   ` Rusty Russell
2002-03-13 16:37                     ` Alan Cox
2002-03-12  9:35                 ` Helge Hafting
2002-03-08 20:40       ` Alan Cox
2002-03-08 20:57         ` Linus Torvalds
2002-03-08 23:43           ` H. Peter Anvin
2002-03-08 22:55         ` Hubertus Franke
2002-03-08 23:38           ` Alan Cox
2002-03-08 23:44           ` H. Peter Anvin
2002-03-08 20:47       ` george anzinger
2002-03-08 23:02         ` Hubertus Franke
2002-03-08 23:47           ` george anzinger
2002-03-09  1:11             ` Alan Cox
2002-03-09  1:20             ` Linus Torvalds
2002-03-09  4:49     ` Rusty Russell
2002-03-11 22:45       ` Linus Torvalds
2002-03-11 23:12         ` Hubertus Franke
2002-03-12  7:20         ` Rusty Russell
2002-03-12 14:56           ` Hubertus Franke
2002-03-13  4:02             ` Rusty Russell
2002-03-12 17:17           ` Linus Torvalds
2002-03-13  2:57             ` Rusty Russell
2002-03-09  4:51 ` Rusty Russell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3C93A1A4.2FDCA723@loewe-komp.de \
    --to=pwaechtler@loewe-komp.de \
    --cc=drepper@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=martin.wirth@dlr.de \
    --cc=rusty@rustcorp.com.au \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).