All of lore.kernel.org
 help / color / mirror / Atom feed
[parent not found: <20040811093945.GA10667@elte.hu>]
[parent not found: <20040811010116.GL11200@holomorphy.com>]
* Scheduler fairness problem on 2.6 series (Attn: Nick Piggin and others)
@ 2004-08-07 21:53 spaminos-ker
  0 siblings, 0 replies; 37+ messages in thread
From: spaminos-ker @ 2004-08-07 21:53 UTC (permalink / raw)
  To: linux-kernel

Hey guys

I ran into this problem a few month ago while trying to migrate to 2.6
series at the time, it was using 2.6.1, I figured maybe it was fixed in the
latest kernel, so I tried 2.6.7 but it shows the same signs.

I am running in an interesting problem with, what seems to be, the
scheduler. I managed to create a test program to reproduce it.

This test system I use is an Athlon XP1800+ with 256 Mo RAM with
Redhat 9 installed
(but I could reproduce it the same way with Dual CPU systems).

I can reproduce this problem on 2.6.7 (and before, I could reproduce
it on a redhat kernel linux-2.4.20-19.9)

Kernel that I compiled with and without CONFIG_PREEMPT, SMP on or off,
to see no difference.

Everything is fine on my regular (non redhat) 2.4 kernel (same box).

Basically, here is what's happening on my server:
* I have a daemon running, serving http requests via worker threads.
* I have an other daemon running that does some sanity check on the
system, basically to monitor the box (and it mostly sleeps).

Under load, the sanity check daemon stops running properly. It seems
like all the CPU is given to the http server, even though they are
started the same way. This leads to all sorts of timeouts, not good
for a production server.

So how to reproduce the problem:

I run in one shell a script to check for slowdowns:
A=$(date +%s) ; while true ; do sleep 1s; B=$(date +%s) ; D=$(($B-$A)) ;
if [ $D -ge 3 ] ; then date ; echo ">>>>>>> delta = $D" ; fi ; A=$B ;
done

and in an other, my test program (see below for source code):
./cputest 2 10000 1000 512 &

note: I am running everything from ssh shells, not from X (don't know
if this has any effect for this kind of test).

And after a few seconds, I get this:

Mon Aug 2 14:40:27 PDT 2004
>>>>>>> delta = 3
Mon Aug 2 14:41:37 PDT 2004
>>>>>>> delta = 3
Mon Aug 2 14:41:51 PDT 2004
>>>>>>> delta = 3
Mon Aug 2 14:41:57 PDT 2004
>>>>>>> delta = 3
Mon Aug 2 14:42:12 PDT 2004
>>>>>>> delta = 3

The script takes 3 seconds to execute?!

That means that the script, that sleeps and doesn't use a lot of CPU,
stops getting cycles, for some reason.

On 2.4.x series kernel (from kernel.org, not redhat), I don't get any
output for the script, meaning the
fairness is good between processes (even if I start it with 60 worker
threads instead of 2).

If I try to increase the number of threads to 60 on 2.6.7, the script
gets stuck for very long period of time (I've seen 90 seconds).

I can post the .config file I used for compiling the 2.6.7 kernel, if
needed, but I figured this post was already big enough :)

After seeing a discussion on scheduling problems, I also tried different
patches, in particular the effect of an alternate scheduler from Nick
Piggin:
2.6.7 -> fails with 2 threads
2.6.7-bk20 -> fails with 2 threads
2.6.7-bk20-np8 -> works fine with 2 threads, fails with 20 threads
2.6.7-mm7 -> fails with 2 threads
2.6.7-mm7-np8 -> works fine with 2 threads, fails with 20 threads

Any clue?

Nicolas

compile the code with:

gcc -pthread -D_REENTRANT -lm cputest.c -o cputest
---------------------- cputest.c -----------------------------

#include <pthread.h>
#include <stdio.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <math.h>

int nbthreads = 0;
int iterations = 0;
int run = 1;
int iterations2 = 0;

#define DEFAULTBUFSIZE 1
int globalpipe[2];

int buflen = DEFAULTBUFSIZE;

void exithandler(int s) {
        if (run) {
                run = 0;
                close(globalpipe[0]);
                close(globalpipe[1]);
        }
}

char *fname;


void *thread_code(void *arg) {
        time_t lasttime = time(NULL);
        int jj = (int)arg;
        int localiter = iterations/jj;
        int fd;

        char *buf;

        buf = alloca(buflen);
        if (buf == NULL) {
                printf("Failed to alloca!\n");
                exit(-1);
        }

        do {
                int i;
                time_t newtime;

                for(i=0; i < localiter && run; i++) {
                        int j;
                        double k = 1.0;
                        for(j=0; j < iterations2 && run; j++) {
                                k += j*k;
                                k = (1-k)*(1+k);
                                k = sqrt(k);
                        }
                        if (fd != -1) {
                                if (read(globalpipe[0], buf, buflen)
!= buflen) {
                                        printf("%d, aborted read\n",
jj);
                                }
                        }

                }
                sleep(1);
                newtime = time(NULL);
                printf("%d    delta = %d\n", (int)arg,
newtime-lasttime);
                lasttime = newtime;
        } while (run);
        return NULL;
}


int main(int argc, char **argv) {
        int i;
        pthread_t *mythreads;

        if (argc < 5)
                return -1;

        nbthreads = atoi(argv[1]);
        iterations = atoi(argv[2]);
        iterations2 = atoi(argv[3]);
        buflen = atoi(argv[4]);

        signal(SIGINT, exithandler);

        if (pipe(globalpipe) != 0) {
                return -2;
        }

        mythreads = (pthread_t *)calloc(nbthreads, sizeof(pthread_t));

        for(i=0; i < nbthreads; i++) {
                if (pthread_create(&mythreads[i], NULL, thread_code,
(void *)(i+1)) != 0) {
                        return -5;
                }
                printf("Started %d\n", i);
        }


        void * res;

        char *buf;
        int localbuflen;

        localbuflen = buflen * nbthreads;

        buf = alloca(localbuflen);
        if (buf == NULL) {
                printf("Failed to alloca!\n");
                run = 0;
                exit(-1);
        }

        sleep(3);

        do {
                write(globalpipe[1], buf, localbuflen);
        } while (run);


        for(i=0; i < nbthreads; i++) {
                printf("Waiting %d\n", i);
                pthread_join(mythreads[i], &res);
        }
        return 0;
}

-------------------- end cputest.c -------------------



^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2004-09-13 20:11 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <411D50AE.5020005@bigpond.net.au>
2004-08-17 23:19 ` Scheduler fairness problem on 2.6 series (Attn: Nick Piggin and others) spaminos-ker
2004-08-18  0:12   ` Peter Williams
2004-08-24 21:11     ` spaminos-ker
2004-08-24 23:04       ` Peter Williams
2004-08-24 23:22         ` Lee Revell
2004-08-26  2:30         ` spaminos-ker
2004-08-26  2:42           ` Peter Williams
2004-08-26  8:39             ` Peter Williams
2004-08-28  1:59               ` spaminos-ker
2004-08-29  0:21                 ` Peter Williams
2004-08-29  0:25                   ` Lee Revell
2004-08-29  0:45                     ` Lee Revell
2004-08-29  2:03                       ` Peter Williams
2004-08-29  2:28                         ` spaminos-ker
2004-08-29  4:53                           ` Peter Williams
2004-08-29  1:19                     ` spaminos-ker
2004-08-29  1:22                       ` Lee Revell
2004-08-29  1:31                         ` Peter Williams
2004-09-13 20:09                           ` spaminos-ker
2004-08-29  2:20                       ` Lee Revell
     [not found] <20040811093945.GA10667@elte.hu>
2004-08-17 23:08 ` spaminos-ker
     [not found] <20040811010116.GL11200@holomorphy.com>
2004-08-11  2:21 ` spaminos-ker
2004-08-11  2:23   ` William Lee Irwin III
2004-08-11  2:45     ` Peter Williams
2004-08-11  2:47       ` Peter Williams
2004-08-11  3:23         ` Peter Williams
2004-08-11  3:31           ` Con Kolivas
2004-08-11  3:46             ` Peter Williams
2004-08-11  3:44           ` Peter Williams
2004-08-13  0:13             ` spaminos-ker
2004-08-13  1:44               ` Peter Williams
2004-08-11  3:09   ` Con Kolivas
2004-08-11 10:24     ` Prakash K. Cheemplavam
2004-08-12  2:04     ` spaminos-ker
2004-08-12  2:24     ` spaminos-ker
2004-08-12  2:53       ` Con Kolivas
2004-08-07 21:53 spaminos-ker

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.