* web page on O(1) scheduler
From: David Mosberger
Date: 2003-05-21  6:49 UTC
To: linux-kernel
Cc: linux-ia64

Recently, I started to look into some odd performance behaviors of the
O(1) scheduler.  I decided to document what I found in a web page
at:

	http://www.hpl.hp.com/research/linux/kernel/o1.php

(it may take another couple of hours before the pages show up outside
the HP firewall, so if you get "page not found" at the moment, don't
be surprised).

I should say that I have no direct stake in the CPU scheduler (never
worked on it, not sure I ever would want to), but I feel that it's
worthwhile to at least document the O(1) scheduler a bit better.
Also, I do feel that the O(1) scheduler currently has rather poor
"long-term" properties.  It would be nice if some of those properties
could be improved without hurting the other excellent properties of
the current O(1) scheduler.

I think the web pages should be most relevant to the HPTC (high
performance technical computing) community, since this is the
community that is most likely affected by some of the performance
oddities of the O(1) scheduler.  Certainly anyone using OpenMP on
Intel platforms (x86 and ia64) may want to take a look.

Comments welcome.

	--david

* RE: [Linux-ia64] Re: web page on O(1) scheduler
From: Boehm, Hans
Date: 2003-05-23 17:48 UTC
To: 'Arjan van de Ven', Hans Boehm
Cc: MOSBERGER, DAVID (HP-PaloAlto,unix3), linux-kernel, linux-ia64

[-- Attachment #1: Type: text/plain, Size: 1526 bytes --]

Sorry about the typo and the mislabeled output in the test program.  I attached the corrected version, which prints the right labels.

The results I posted did not use NPTL.  (Presumably OpenMP wasn't targeted at NPTL either.)  I don't think that NPTL has any bearing on the underlying issues that I mentioned, though path lengths are probably a bit shorter.  It should also handle contention substantially better, but that wasn't tested.

I did rerun the test case on a 900 MHz Itanium 2 machine with a more recent Debian installation with NPTL.  I get 200 msecs (20 nsecs/iteration) with the custom lock, and 768 msecs for pthreads.  (With static linking that decreases to 658 msecs for pthreads.)  Pthreads (and/or some of the other infrastructure) is clearly getting better, but I don't think the difference will disappear.

I don't have a Xeon with NPTL handy.  On an old PPro, the results were 1133 and 4379 msecs for custom and NPTL respectively.

Hans

> -----Original Message-----
> From: Arjan van de Ven [mailto:arjanv@redhat.com]
> Sent: Friday, May 23, 2003 1:31 AM
> To: Hans Boehm
> Cc: Arjan van de Ven; davidm@hpl.hp.com; linux-kernel@vger.kernel.org;
> linux-ia64@linuxia64.org
> Subject: Re: [Linux-ia64] Re: web page on O(1) scheduler
> 
> 
> On Thu, May 22, 2003 at 06:07:46PM -0700, Hans Boehm wrote:
> > case.
> > 
> > On a 1GHz Itanium 2 I get
> > 
> > Custom lock: 180 msecs
> > Custom lock: 1382 msecs
> > 
> > On a 2GHz Xeon, I get
> > 
> > Custom lock: 646 msecs
> > Custom lock: 1659 msecs
> 
> is the pthreads one with nptl ?
> 


[-- Attachment #2: time_lock.c --]
[-- Type: application/octet-stream, Size: 1760 bytes --]

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>	/* abort() */
#include <sys/time.h>
#include "atomic_ops.h"

/* Timing code stolen from Ellis/Kovac/Boehm GCBench.			*/
#define currentTime() stats_rtclock()
#define elapsedTime(x) (x)

/* Return the current wall-clock time in milliseconds. */
unsigned long
stats_rtclock( void )
{
  struct timeval t;
  struct timezone tz;

  if (gettimeofday( &t, &tz ) == -1)
    return 0;
  /* Cast before multiplying to avoid signed overflow of time_t. */
  return ((unsigned long)t.tv_sec * 1000 + t.tv_usec / 1000);
}

AO_TS_T my_spin_lock = AO_TS_INITIALIZER;

pthread_mutex_t my_pthread_lock = PTHREAD_MUTEX_INITIALIZER;

void spin_lock_ool(AO_TS_T *lock)
{
  /* Should repeatedly retry the AO_test_and_set_acquire, perhaps	*/
  /* after trying a plain read.  Should "exponentially" back off	*/
  /* between tries.  For short time periods it should spin, for 	*/
  /* medium ones it should use sched_yield, and for longer ones usleep. */

  /* For now we punt, since this is a contention-free test.		*/
  abort();
}

inline void spin_lock(AO_TS_T *lock)
{
  if (__builtin_expect(AO_test_and_set_acquire(lock) != AO_TS_CLEAR, 0))
    spin_lock_ool(lock);
}

inline void spin_unlock(AO_TS_T *lock)
{
  AO_CLEAR(lock);
}

int main()
{
  unsigned long start_time, end_time;
  int i;
  
  start_time = currentTime();
  for (i = 0; i < 10000000; ++i) {
    spin_lock(&my_spin_lock);
    spin_unlock(&my_spin_lock);
  }
  end_time = currentTime();
  fprintf(stderr, "Custom lock: %lu msecs\n",
	  elapsedTime(end_time - start_time));
  start_time = currentTime();
  for (i = 0; i < 10000000; ++i) {
    pthread_mutex_lock(&my_pthread_lock);
    pthread_mutex_unlock(&my_pthread_lock);
  }
  end_time = currentTime();
  fprintf(stderr, "Pthread lock: %lu msecs\n",
	  elapsedTime(end_time - start_time));
  return 0;
}
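
A minimal sketch of what spin_lock_ool could look like, following the
strategy its comment describes, appears below.  This is illustrative
only: it reuses the AO_* primitives from atomic_ops.h above, the
thresholds and the sleep interval are invented values, and real code
would back off exponentially rather than use fixed cutoffs.

#include <sched.h>   /* sched_yield() */
#include <unistd.h>  /* usleep() */

void spin_lock_ool_sketch(AO_TS_T *lock)
{
  int tries = 0;

  for (;;) {
    /* Try a plain read first so we don't bounce the cache line with
       failed atomic operations while the lock is held.              */
    if (*(volatile AO_TS_T *)lock == AO_TS_CLEAR
        && AO_test_and_set_acquire(lock) == AO_TS_CLEAR)
      return;
    ++tries;
    if (tries < 32)
      ;                /* short wait: keep spinning            */
    else if (tries < 64)
      sched_yield();   /* medium wait: give up the processor   */
    else
      usleep(1000);    /* long wait: sleep for a millisecond   */
  }
}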

* RE: [Linux-ia64] Re: web page on O(1) scheduler
From: Boehm, Hans
Date: 2003-05-24  0:10 UTC
To: 'Davide Libenzi', Boehm, Hans
Cc: 'Arjan van de Ven', Hans Boehm, MOSBERGER, DAVID (HP-PaloAlto,unix3),
	Linux Kernel Mailing List, linux-ia64

Pthread_spin_lock() under the NPTL version in RH9 does basically what my custom locks do in the uncontested case, aside from the function call.  But remember that this began with a discussion about whether it was reasonable for user locking code to explicitly yield rather than relying on pthreads to suspend the thread.  I don't think pthread_spin_lock is relevant in this context, for two reasons:

1) At least the RH9 version of pthread_spin_lock in NPTL literally spins and makes no attempt to yield or block.  This only makes sense at user level if you are 100% certain that the processors won't be overcommitted.  Otherwise there is little to be lost by blocking once you have spun for sufficiently long.  You could use pthread_spin_trylock and block explicitly, but that gets us back to custom blocking code (sketched after this list).

2) AFAICT, pthread_spin_lock is currently a little too bleeding edge to be widely used.  I tried to time it, but failed.  pthread.h doesn't declare pthread_spinlock_t by default, at least not yet.  It doesn't seem to have a Linux man page yet.  I tried to define the magic macro to get it declared, but that broke something else.
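
For illustration, the pthread_spin_trylock-plus-explicit-blocking route
mentioned in point 1 might be sketched as follows.  This is a hedged
sketch, not NPTL code: the cutoffs are invented, and it assumes that
defining _GNU_SOURCE is enough to expose the spinlock declarations.

#define _GNU_SOURCE   /* assumption: exposes the pthread spinlock API */
#include <pthread.h>
#include <sched.h>    /* sched_yield() */
#include <unistd.h>   /* usleep() */

/* Acquire a pthread spinlock without spinning forever: spin briefly,
   then yield, then sleep.  The cutoffs are illustrative only.        */
void spin_then_block(pthread_spinlock_t *lock)
{
  int tries = 0;

  while (pthread_spin_trylock(lock) != 0) {  /* nonzero means busy */
    ++tries;
    if (tries < 32)
      ;                /* short wait: keep spinning              */
    else if (tries < 64)
      sched_yield();   /* medium wait: let another thread run    */
    else
      usleep(1000);    /* long wait: sleep instead of spinning   */
  }
}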

Hans

> -----Original Message-----
> From: Davide Libenzi [mailto:davidel@xmailserver.org]
> Sent: Friday, May 23, 2003 11:05 AM
> To: Boehm, Hans
> Cc: 'Arjan van de Ven'; Hans Boehm; MOSBERGER, DAVID
> (HP-PaloAlto,unix3); Linux Kernel Mailing List; 
> linux-ia64@linuxia64.org
> Subject: RE: [Linux-ia64] Re: web page on O(1) scheduler
> 
> 
> On Fri, 23 May 2003, Boehm, Hans wrote:
> 
> > Sorry about the typo and misnaming for the test program.  I attached
> > the correct version that prints the right labels.
> >
> > The results I posted did not use NPTL.  (Presumably OpenMP wasn't
> > targeted at NPTL either.)  I don't think that NPTL has any bearing on
> > the underlying issues that I mentioned, though path lengths are
> > probably a bit shorter.  It should also handle contention substantially
> > better, but that wasn't tested.
> >
> > I did rerun the test case on a 900 MHz Itanium 2 machine with a more
> > recent Debian installation with NPTL.  I get 200 msecs (20 nsecs/iter)
> > with the custom lock, and 768 for pthreads.  (With static linking that
> > decreases to 658 for pthreads.)  Pthreads (and/or some of the other
> > infrastructure) is clearly getting better, but I don't think the
> > difference will disappear.
> 
> To make things more fair you should test against pthread spinlocks.
> Also, for tight loops like that, even an extra level of call depth
> (which pthread is likely to add) is going to matter.
> 
> 
> 
> - Davide
> 

* RE: [Linux-ia64] Re: web page on O(1) scheduler
From: Boehm, Hans
Date: 2003-05-24  0:53 UTC
To: 'Davide Libenzi', Boehm, Hans
Cc: 'Arjan van de Ven', Hans Boehm, MOSBERGER, DAVID (HP-PaloAlto,unix3),
	Linux Kernel Mailing List, linux-ia64

Thanks for the pointer.  In the statically linked case, I get 200/568/345
msecs for custom/pthread_mutex/pthread_spin.

I agree that this is not a fair comparison.  That was my point.  An
implementation with custom yield/sleep code can do things that you can't
do with blocking pthreads primitives at the same performance.  (Of course
pthread_mutex_lock will win in other cases.)

Please forget about the abort() in the contention case.  I put that there for
brevity since it is not exercised by the test.  The intent was to time the
noncontention performance of a custom lock that first spins, then yields,
and then sleeps, as was stated in the comment.

You are forgetting two issues in your analysis of what pthreads is/should be doing
relative to the spin-lock-like code:

1) The unlock code is different.  If you potentially do a waitforunlocks()
in the locking code, you need to at least check whether the corresponding
notification is necessary when you unlock().  For NPTL that requires another
atomic operation, and hence another dozen to a couple of hundred cycles,
depending on the processor.  You need to look at both the lock and unlock
code (see the first sketch below).

2) (I hadn't mentioned this before.)  The (standard interpretation of the)
memory barrier semantics of the pthreads primitives is too strong.
Arguably they need to be full memory barriers in both directions.  As a
result, the pthread_spin_lock code inserts an extra full memory barrier
on IA64, instead of just using the acquire barrier associated with the
cmpxchg.acq instruction (see the second sketch below).  (I think the spin
unlock code doesn't do this.  One could argue that that's a bug, though
I would argue that the bug is really in the pthreads spec.)
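
To make point 1 concrete, here is a rough sketch (written with modern GCC
atomic builtins for brevity; this is not NPTL's actual code) of why the
unlock path of a blocking lock needs an extra atomic read-modify-write
that a pure spinlock's unlock does not:

/* Lock word states: 0 = free, 1 = locked, 2 = locked with waiters. */

extern void wake_one_waiter(int *lock);  /* hypothetical futex-style wake */

void blocking_unlock(int *lock)
{
  /* The atomic exchange is the extra cost: the unlock path must find
     out, atomically, whether anyone is blocked and needs waking.     */
  if (__atomic_exchange_n(lock, 0, __ATOMIC_RELEASE) == 2)
    wake_one_waiter(lock);
}

void spinlock_unlock(int *lock)
{
  /* A spinlock unlock is just a release store: no atomic
     read-modify-write, no notification check.            */
  __atomic_store_n(lock, 0, __ATOMIC_RELEASE);
}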
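
And for point 2, restated in C11 atomics as a compact (and admittedly
anachronistic) notation; the 2003 implementation used ia64 instructions
directly, but the distinction is the same:

#include <stdatomic.h>

atomic_int lk;  /* 0 = free, 1 = held */

/* Acquire-only ordering: what cmpxchg.acq by itself gives on ia64. */
void lock_acquire_only(void)
{
  int expected = 0;
  while (!atomic_compare_exchange_weak_explicit(&lk, &expected, 1,
             memory_order_acquire, memory_order_relaxed))
    expected = 0;  /* CAS failed: reset expected and retry */
}

/* Full barrier in both directions: the stronger semantics the usual
   reading of the pthreads spec demands, costing an extra fence
   (an mf instruction on ia64) beyond the acquire.                  */
void lock_full_barrier(void)
{
  int expected = 0;
  while (!atomic_compare_exchange_weak_explicit(&lk, &expected, 1,
             memory_order_seq_cst, memory_order_relaxed))
    expected = 0;
}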

Hans

> -----Original Message-----
> From: Davide Libenzi [mailto:davidel@xmailserver.org]
> Sent: Friday, May 23, 2003 5:20 PM
> To: Boehm, Hans
> Cc: 'Arjan van de Ven'; Hans Boehm; MOSBERGER, DAVID
> (HP-PaloAlto,unix3); Linux Kernel Mailing List; 
> linux-ia64@linuxia64.org
> Subject: RE: [Linux-ia64] Re: web page on O(1) scheduler
> 
> 
> On Fri, 23 May 2003, Boehm, Hans wrote:
> 
> > Pthread_spin_lock() under the NPTL version in RH9 does basically what
> > my custom locks do in the uncontested case, aside from the function
> > call.  But remember that this began with a discussion about whether it
> > was reasonable for user locking code to explicitly yield rather than
> > relying on pthreads to suspend the thread.  I don't think
> > pthread_spin_lock is relevant in this context, for two reasons:
> >
> > 1) At least the RH9 version of pthread_spin_lock in NPTL literally
> > spins and makes no attempt to yield or block.  This only makes sense
> > at user level if you are 100% certain that the processors won't be
> > overcommitted.  Otherwise there is little to be lost by blocking once
> > you have spun for sufficiently long.  You could use pthread_spin_trylock
> > and block explicitly, but that gets us back to custom blocking code.
> 
> Yes, that would be a spinlock. Your code was basically a spinlock that
> instead of spinning was doing abort() in the contention case. Again, you
> measured two different things. Even if the pthread mutex does something
> very simple like:
> 
> 	spinlock(mtx->lock);
> 	while (mtx->busy) {
> 		spinunlock(mtx->lock);
> 		waitforunlocks();
> 		spinlock(mtx->lock);
> 	}
> 	mtx->busy++;
> 	spinunlock(mtx->lock);
> 
> Only the fact that this code likely resides at a deeper call level
> will make you pay in a tight loop like yours.
> 
> 
> > 2) AFAICT, pthread_spin_lock is currently a little too bleeding edge
> > to be widely used.  I tried to time it, but failed.  pthread.h doesn't
> > include the declaration for pthread_spinlock_t by default, at least
> > not yet.  It doesn't seem to have a Linux man page yet.  I tried to
> > define the magic macro to get it declared, but that broke something else.
> 
> $ gcc -D_GNU_SOURCE ...
> 
> 
> 
> - Davide
> 


Thread overview: 47+ messages
2003-05-21  6:49 web page on O(1) scheduler David Mosberger
2003-05-21  9:01 ` Arjan van de Ven
2003-05-21 10:40   ` [Linux-ia64] " Duraid Madina
2003-05-21 10:43     ` Christoph Hellwig
2003-05-21 13:12     ` Arjan van de Ven
2003-05-21 13:51       ` Olivier Galibert
2003-05-28 22:12         ` Bill Davidsen
     [not found]         ` <Pine.LNX.3.96.1030528180909.21414B-100000@gatekeeper.tmr.com>
2003-05-29  5:59           ` Mike Galbraith
2003-06-02  8:05             ` Ingo Molnar
2003-06-04  4:07               ` Bill Davidsen
     [not found]             ` <Pine.LNX.4.44.0306020949520.3375-100000@localhost.localdomain>
2003-06-02 13:51               ` Mike Galbraith
2003-06-04  3:52             ` Bill Davidsen
2003-06-04  4:55               ` David Schwartz
     [not found]             ` <Pine.LNX.3.96.1030603234616.16495B-100000@gatekeeper.tmr.com>
2003-06-04  7:13               ` Mike Galbraith
2003-06-04 15:30                 ` Jan Harkes
2003-05-21 19:18       ` Duraid Madina
2003-05-21 20:03         ` Helge Hafting
2003-05-21 22:59         ` Alan Cox
2003-05-21 15:18   ` David Mosberger
2003-05-23  1:07   ` [Linux-ia64] " Hans Boehm
2003-05-23  8:30     ` Arjan van de Ven
2003-05-21  9:26 ` Mike Galbraith
2003-05-21  9:30   ` Mike Galbraith
2003-05-21 17:56   ` David Mosberger
2003-05-21 20:46     ` Mike Galbraith
2003-05-22  9:52     ` Mike Galbraith
2003-05-22 16:25       ` Mike Galbraith
2003-05-22 17:58         ` David Mosberger
2003-05-27 15:16         ` [case closed] " Mike Galbraith
2003-06-03 20:59       ` sched.c gives ICE [Was: Re: web page on O(1) scheduler] J.A. Magallon
2003-06-03 22:29         ` Mike Galbraith
2003-06-22 22:03         ` J.A. Magallon
2003-06-22 22:10           ` J.A. Magallon
2003-05-22  0:38   ` web page on O(1) scheduler Rik van Riel
     [not found]   ` <Pine.LNX.4.50L.0305212038120.5425-100000@imladris.surriel.com>
2003-05-22  5:52     ` Mike Galbraith
2003-05-22 14:47       ` Valdis.Kletnieks
2003-05-22 16:12         ` Mike Galbraith
2003-05-25  9:17       ` Mike Galbraith
2003-05-21 18:31 ` [Linux-ia64] " David Mosberger
2003-05-21 20:00   ` Cyclades Cyclom-Y ISA on 2.5.69 John Stoffel
2003-05-23 17:48 [Linux-ia64] Re: web page on O(1) scheduler Boehm, Hans
2003-05-23 18:04 ` Davide Libenzi
2003-05-24  0:10 Boehm, Hans
2003-05-24  0:20 ` Davide Libenzi
2003-05-24  0:53 Boehm, Hans
2003-05-24  5:38 ` Davide Libenzi
2003-05-24 14:43   ` Davide Libenzi
