linux-kernel.vger.kernel.org archive mirror
* High CPU usage of scheduler?
@ 2012-04-26 22:08 Dave Johansen
  2012-04-27  3:10 ` Yong Zhang
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Johansen @ 2012-04-26 22:08 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3318 bytes --]

I am looking into moving an application from RHEL 5 to RHEL 6 and I
noticed an unexpected increase in CPU usage. A little digging has led
me to believe that the scheduler may be the culprit.

I created the attached test_select_work.c file to test this out. I
compiled it with the following command on RHEL 5:

cc test_select_work.c -O2 -DSLEEP_TYPE=0 -Wall -Wextra -lm -lpthread
-o test_select_work

I then played with the parameters until the execution time per
iteration was about 1 ms on a Dell Precision m6500.
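
For reference, the work part can be timed on its own when picking
work_size; a minimal sketch of that calibration (it just reuses the fill
loop from the attached file, and the 300 kb default below is only
illustrative):

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

/* Time a single pass of the same pseudo-random fill that
 * test_select_work.c performs per iteration, for a given work_size in kb,
 * to find a value that gives roughly 1 ms of work. */
int main(int argc, char **argv)
{
	const int work_size = (argc > 1 ? atoi(argv[1]) : 300) * 1024;
	int *buf = calloc(work_size, sizeof(int));
	int pseed = 1;
	int bnum;
	struct timeval before, after;

	if (buf == NULL)
		return 1;

	gettimeofday(&before, NULL);
	for (bnum = 0; bnum < work_size; ++bnum) {
		pseed = pseed * 1103515245 + 12345;
		buf[bnum] = (pseed / 65536) % 32768;
	}
	gettimeofday(&after, NULL);

	printf("work: %lld us\n",
	       1000000LL * (after.tv_sec - before.tv_sec) +
	       (after.tv_usec - before.tv_usec));
	free(buf);
	return 0;
}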

I got the following result on RHEL 5:

    ./test_select_work 1000 10000 300 4
    time_per_iteration: min: 911.5 us avg: 913.7 us max: 917.1 us stddev: 2.4 us
    ./test_select_work 1000 10000 300 8
    time_per_iteration: min: 1802.6 us avg: 1803.9 us max: 1809.1 us
stddev: 2.1 us
    ./test_select_work 1000 10000 300 40
    time_per_iteration: min: 7580.4 us avg: 8567.3 us max: 9022.0 us
stddev: 299.6 us

And the following on RHEL 6:

    ./test_select_work 1000 10000 300 4
    time_per_iteration: min: 914.6 us avg: 975.7 us max: 1034.5 us
stddev: 50.0 us
    ./test_select_work 1000 10000 300 8
    time_per_iteration: min: 1683.9 us avg: 1771.8 us max: 1810.8 us
stddev: 43.4 us
    ./test_select_work 1000 10000 300 40
    time_per_iteration: min: 7997.1 us avg: 8709.1 us max: 9061.8 us
stddev: 310.0 us

On both versions, these results were about what I expected with the
average amount of time per iteration scaling relatively linearly. I
then recompiled with -DSLEEP_TYPE=1 and got the following results on
RHEL 5:

    ./test_select_work 1000 10000 300 4
    time_per_iteration: min: 1803.3 us avg: 1902.8 us max: 2001.5 us
stddev: 113.8 us
    ./test_select_work 1000 10000 300 8
    time_per_iteration: min: 1997.1 us avg: 2002.0 us max: 2010.8 us
stddev: 5.0 us
    ./test_select_work 1000 10000 300 40
    time_per_iteration: min: 6958.4 us avg: 8397.9 us max: 9423.7 us
stddev: 619.7 us

And the following results on RHEL 6:

    ./test_select_work 1000 10000 300 4
    time_per_iteration: min: 2107.1 us avg: 2143.1 us max: 2177.7 us
stddev: 30.3 us
    ./test_select_work 1000 10000 300 8
    time_per_iteration: min: 2903.3 us avg: 2903.8 us max: 2904.3 us
stddev: 0.3 us
    ./test_select_work 1000 10000 300 40
    time_per_iteration: min: 8877.7.1 us avg: 9016.3 us max: 9112.6 us
stddev: 62.9 us

On RHEL 5, the results were about what I expected (4 threads taking
twice as long because of the 1 ms sleep but the 8 threads taking the
same amount of time since each thread is now sleeping for about half
the time, and a still fairly linear increase with the 40 threads).
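
My expectation was based on a rough model like the following (assuming the
m6500 effectively runs 4 of these threads in parallel and that each
iteration does about 0.9 ms of work; both numbers are only estimates from
the SLEEP_TYPE=0 runs):

#include <stdio.h>

/* Rough model of the expected per-iteration wall time, assuming 4 cores
 * and ~0.9 ms of work per iteration (estimates, not measured constants). */
int main(void)
{
	const double work_ms = 0.9, sleep_ms = 1.0;
	const double pair_ms = 2 * work_ms; /* two threads sharing one core */

	/* 4 threads, one per core: nothing overlaps the 1 ms sleep */
	printf("4 threads: ~%.1f ms/iteration\n", work_ms + sleep_ms);
	/* 8 threads, two per core: while one thread sleeps, the other can
	 * run, so the sleep is mostly hidden behind the second thread's work */
	printf("8 threads: ~%.1f ms/iteration\n",
	       pair_ms > work_ms + sleep_ms ? pair_ms : work_ms + sleep_ms);
	return 0;
}

That works out to roughly 2 ms per iteration for both the 4 and 8 thread
cases, which is about what RHEL 5 reports.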

However, with RHEL 6, the time taken with 4 threads increased by about
15% more than the expected doubling and the 8 thread case increased by
about 45% more than the expected slight increase. The increase in the
4 thread case seems to be that RHEL 6 is actually sleeping for a
handful of microseconds more than 1 ms while RHEL 5 is only sleeping
about 900 us, but this doesn't explain the unexpectedly large increase
in the 8 and 40 thread cases.

I saw similar types of behaviour with all 3 -DSLEEP_TYPE values. I
also tried playing with the scheduler parameters in sysctl, but
nothing seemed to have a significant impact on the results. Any ideas
on how I can further diagnose this issue?

Thanks,
Dave

[-- Attachment #2: test_select_work.c --]
[-- Type: text/x-csrc, Size: 4624 bytes --]

#include <float.h>
#include <math.h>
#include <poll.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/time.h>


// Sleep used inside the loop of each of the threads:
//   0 = no sleep, 1 = select, 2 = poll, 3 = usleep
// Can be overridden on the compile line with -DSLEEP_TYPE=<n>.
#ifndef SLEEP_TYPE
#define SLEEP_TYPE 1
#endif


struct thread_info {
	int sleep_time;
	int num_iterations;
	int work_size;
};

void *do_test(void *arg)
{
	const struct thread_info *tinfo = (struct thread_info *)arg;

	const int sleep_time = tinfo->sleep_time;
	const int num_iterations = tinfo->num_iterations;
	const int work_size = tinfo->work_size;

#if SLEEP_TYPE == 1
	// Data for calling select
	struct timeval ts;
#elif SLEEP_TYPE == 2
	// Data for calling poll
	struct pollfd pfd;
#endif
	// Data for doing work
	int *buf;
	int pseed;
	int inum, bnum;
	// Data for tracking the time
  struct timeval before, after;
	long long *diff;
 
	buf = calloc(work_size, sizeof(int));
	diff = malloc(sizeof(unsigned long long));

#if SLEEP_TYPE == 2
	// Initialize the poll data
	pfd.fd = 0;
	pfd.events = 0;
#endif

	// Get the time before starting the processing
  gettimeofday(&before, NULL);

	// Do the requested number of iterations
	for (inum=0; inum<num_iterations; ++inum) {
#if SLEEP_TYPE == 1
		ts.tv_sec = 0;
		ts.tv_usec = sleep_time;
		select(0, 0, 0, 0, &ts);
#elif SLEEP_TYPE == 2
		poll(&pfd, 1, sleep_time / 1000);
#elif SLEEP_TYPE == 3
		usleep(sleep_time);
#else
		// Get rid of warning about unused variable
		(void)sleep_time;
#endif

		// Fill in a buffer with random numbers (taken from latt.c by Jens Axboe <jens.axboe@oracle.com>)
		pseed = 1;
		for (bnum=0; bnum<work_size; ++bnum) {
			pseed = pseed * 1103515245 + 12345;
			buf[bnum] = (pseed / 65536) % 32768;
		}
	}

	// Get the time after starting the processing
  gettimeofday(&after, NULL);

	// Calculate the delta time
  *diff = 1000000LL * (after.tv_sec - before.tv_sec);
  *diff += after.tv_usec - before.tv_usec;

	// Clean up the data
	free(buf);

	return diff;
}

int main(int argc, char **argv)
{
	if (argc < 5) {
		printf("Usage: %s <sleep_time> <num_iterations> <work_size> <num_threads>\n", argv[0]);
		return -1;
	}

	struct thread_info tinfo;
	int s, tnum, num_threads;
	pthread_attr_t attr;
	pthread_t *threads;
	long long *res;
	long long *times;

	// Get the parameters
	tinfo.sleep_time = atoi(argv[1]);
	tinfo.num_iterations = atoi(argv[2]);
	tinfo.work_size = atoi(argv[3]) * 1024;
	num_threads = atoi(argv[4]);

	// Initialize the thread creation attributes
	s = pthread_attr_init(&attr);
  if (s != 0) {
		printf("Error initializing thread attributes\n");
		return -2;
	}

	// Allocate the memory to track the threads
 	threads = calloc(num_threads, sizeof(pthread_t));
 	times = calloc(num_threads, sizeof(unsigned long long));
	if (threads == NULL || times == NULL) {
		printf("Error allocating memory to track threads\n");
		return -3;
	}

	// Start all of the threads
	for (tnum=0; tnum<num_threads; ++tnum) {
		s = pthread_create(&threads[tnum], &attr, &do_test, &tinfo);

		if (s != 0) {
			printf("Error starting thread\n");
			return -4;
		}
	}

	// Clean up the thread creation attributes
	s = pthread_attr_destroy(&attr);
  if (s != 0) {
		printf("Error cleaning up thread attributes\n");
		return -5;
	}

	// Wait for all the threads to finish
	for (tnum=0; tnum<num_threads; ++tnum) {
		s = pthread_join(threads[tnum], (void **)(&res));

		if (s != 0) {
			printf("Error waiting for thread\n");
			return -6;
		}

		// Save the time
		times[tnum] = *res;

		// And clean it up
		free(res);
	}

	// Calculate the min, max, and average times
	float min_time = FLT_MAX;
	float max_time = -FLT_MAX;
	float avg_time = 0;
	for (tnum=0; tnum<num_threads; ++tnum) {
		if (times[tnum] < min_time)
			min_time = times[tnum];
		if (times[tnum] > max_time)
			max_time = times[tnum];
		avg_time += (times[tnum] - avg_time) / (float)(tnum + 1);
	}
	// Calculate the standard deviation of the time
	float stddev_time = 0;
	if (num_threads > 1) {
		for (tnum=0; tnum<num_threads; ++tnum)
			stddev_time += pow(times[tnum] - avg_time, 2);
		stddev_time = sqrtf(stddev_time / (num_threads - 1));
	}

	// Print out the statistics of the times
	printf("time_per_iteration: min: %.1f us avg: %.1f us max: %.1f us stddev: %.1f us\n",
			min_time / tinfo.num_iterations,
			avg_time / tinfo.num_iterations,
			max_time / tinfo.num_iterations,
			stddev_time / tinfo.num_iterations);

	// Clean up the allocated threads
	free(threads);

	return 0;
}


* Re: High CPU usage of scheduler?
  2012-04-26 22:08 High CPU usage of scheduler? Dave Johansen
@ 2012-04-27  3:10 ` Yong Zhang
  2012-04-27 15:23   ` Dave Johansen
  0 siblings, 1 reply; 9+ messages in thread
From: Yong Zhang @ 2012-04-27  3:10 UTC (permalink / raw)
  To: Dave Johansen; +Cc: linux-kernel

On Thu, Apr 26, 2012 at 03:08:51PM -0700, Dave Johansen wrote:
> I am looking into moving an application from RHEL 5 to RHEL 6 and I
> noticed an unexpected increase in CPU usage. A little digging has led
> me to believe that the scheduler may be the culprit.
> 
> I created the attached test_select_work.c file to test this out. I
> compiled it with the following command on RHEL 5:
> 
> cc test_select_work.c -O2 -DSLEEP_TYPE=0 -Wall -Wextra -lm -lpthread
> -o test_select_work

Hmm... Do both RHEL 5 and RHEL 6 have high resolution timers enabled?

If not, could you please try booting the one that enables high resolution
timers with 'highres=off' to see if things change?
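
One quick way to see what timer granularity the running kernel actually
provides is clock_getres(); a small sketch (just an illustration; with
high resolution timers it usually reports 1 ns, without them roughly the
tick period 1/HZ; link with -lrt on older glibc):

#include <stdio.h>
#include <time.h>

/* Print the reported resolution of CLOCK_MONOTONIC. */
int main(void)
{
	struct timespec res;

	if (clock_getres(CLOCK_MONOTONIC, &res) != 0) {
		perror("clock_getres");
		return 1;
	}
	printf("CLOCK_MONOTONIC resolution: %ld s %ld ns\n",
	       (long)res.tv_sec, res.tv_nsec);
	return 0;
}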

Thanks,
Yong

> 
> I then played with the parameters until the execution time per
> iteration was about 1 ms on a Dell Precision m6500.
> 
> I got the following result on RHEL 5:
> 
>     ./test_select_work 1000 10000 300 4
>     time_per_iteration: min: 911.5 us avg: 913.7 us max: 917.1 us stddev: 2.4 us
>     ./test_select_work 1000 10000 300 8
>     time_per_iteration: min: 1802.6 us avg: 1803.9 us max: 1809.1 us
> stddev: 2.1 us
>     ./test_select_work 1000 10000 300 40
>     time_per_iteration: min: 7580.4 us avg: 8567.3 us max: 9022.0 us
> stddev: 299.6 us
> 
> And the following on RHEL 6:
> 
>     ./test_select_work 1000 10000 300 4
>     time_per_iteration: min: 914.6 us avg: 975.7 us max: 1034.5 us
> stddev: 50.0 us
>     ./test_select_work 1000 10000 300 8
>     time_per_iteration: min: 1683.9 us avg: 1771.8 us max: 1810.8 us
> stddev: 43.4 us
>     ./test_select_work 1000 10000 300 40
>     time_per_iteration: min: 7997.1 us avg: 8709.1 us max: 9061.8 us
> stddev: 310.0 us
> 
> On both versions, these results were about what I expected with the
> average amount of time per iteration scaling relatively linearly. I
> then recompiled with -DSLEEP_TYPE=1 and got the following results on
> RHEL 5:
> 
>     ./test_select_work 1000 10000 300 4
>     time_per_iteration: min: 1803.3 us avg: 1902.8 us max: 2001.5 us
> stddev: 113.8 us
>     ./test_select_work 1000 10000 300 8
>     time_per_iteration: min: 1997.1 us avg: 2002.0 us max: 2010.8 us
> stddev: 5.0 us
>     ./test_select_work 1000 10000 300 40
>     time_per_iteration: min: 6958.4 us avg: 8397.9 us max: 9423.7 us
> stddev: 619.7 us
> 
> And the following results on RHEL 6:
> 
>     ./test_select_work 1000 10000 300 4
>     time_per_iteration: min: 2107.1 us avg: 2143.1 us max: 2177.7 us
> stddev: 30.3 us
>     ./test_select_work 1000 10000 300 8
>     time_per_iteration: min: 2903.3 us avg: 2903.8 us max: 2904.3 us
> stddev: 0.3 us
>     ./test_select_work 1000 10000 300 40
>     time_per_iteration: min: 8877.7.1 us avg: 9016.3 us max: 9112.6 us
> stddev: 62.9 us
> 
> On RHEL 5, the results were about what I expected (4 threads taking
> twice as long because of the 1 ms sleep but the 8 threads taking the
> same amount of time since each thread is now sleeping for about half
> the time, and a still fairly linear increase with the 40 threads).
> 
> However, with RHEL 6, the time taken with 4 threads increased by about
> 15% more than the expected doubling and the 8 thread case increased by
> about 45% more than the expected slight increase. The increase in the
> 4 thread case seems to be that RHEL 6 is actually sleeping for a
> handful of microseconds more than 1 ms while RHEL 5 is only sleep
> about 900 us, but this doesn't explain the unexpectedly large increase
> in the 8 and 40 thread cases.
> 
> I saw similar types of behaviour with all 3 -DSLEEP_TYPE values. I
> also tried playing with the scheduler parameters in sysctl, but
> nothing seemed to have a significant impact on the results. Any ideas
> on how I can further diagnose this issue?
> 
> Thanks,
> Dave

> #include <float.h>
> #include <math.h>
> #include <poll.h>
> #include <pthread.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <sys/select.h>
> #include <sys/time.h>
> 
> 
> // Uncomment to use select inside the loop of each of the threads
> #define SLEEP_TYPE 1
> // Uncomment to use poll inside the loop of each of the threads
> //#define SLEEP_TYPE 2
> // Uncomment to use usleep inside the loop of each of the threads
> //#define SLEEP_TYPE 3
> 
> 
> struct thread_info {
> 	int sleep_time;
> 	int num_iterations;
> 	int work_size;
> };
> 
> void *do_test(void *arg)
> {
> 	const struct thread_info *tinfo = (struct thread_info *)arg;
> 
> 	const int sleep_time = tinfo->sleep_time;
> 	const int num_iterations = tinfo->num_iterations;
> 	const int work_size = tinfo->work_size;
> 
> #if SLEEP_TYPE == 1
> 	// Data for calling select
> 	struct timeval ts;
> #elif SLEEP_TYPE == 2
> 	// Data for calling poll
> 	struct pollfd pfd;
> #endif
> 	// Data for doing work
> 	int *buf;
> 	int pseed;
> 	int inum, bnum;
> 	// Data for tracking the time
>   struct timeval before, after;
> 	long long *diff;
>  
> 	buf = calloc(work_size, sizeof(int));
> 	diff = malloc(sizeof(unsigned long long));
> 
> #if SLEEP_TYPE == 2
> 	// Initialize the poll data
> 	pfd.fd = 0;
> 	pfd.events = 0;
> #endif
> 
> 	// Get the time before starting the processing
>   gettimeofday(&before, NULL);
> 
> 	// Do the requested number of iterations
> 	for (inum=0; inum<num_iterations; ++inum) {
> #if SLEEP_TYPE == 1
> 		ts.tv_sec = 0;
> 		ts.tv_usec = sleep_time;
> 		select(0, 0, 0, 0, &ts);
> #elif SLEEP_TYPE == 2
> 		poll(&pfd, 1, sleep_time / 1000);
> #elif SLEEP_TYPE == 3
> 		usleep(sleep_time);
> #else
> 		// Get rid of warning about unused variable
> 		(void)sleep_time;
> #endif
> 
> 		// Fill in a buffer with random numbers (taken from latt.c by Jens Axboe <jens.axboe@oracle.com>)
> 		pseed = 1;
> 		for (bnum=0; bnum<work_size; ++bnum) {
> 			pseed = pseed * 1103515245 + 12345;
> 			buf[bnum] = (pseed / 65536) % 32768;
> 		}
> 	}
> 
> 	// Get the time after starting the processing
>   gettimeofday(&after, NULL);
> 
> 	// Calculate the delta time
>   *diff = 1000000LL * (after.tv_sec - before.tv_sec);
>   *diff += after.tv_usec - before.tv_usec;
> 
> 	// Clean up the data
> 	free(buf);
> 
> 	return diff;
> }
> 
> int main(int argc, char **argv)
> {
> 	if (argc < 4) {
> 		printf("Usage: %s <sleep_time> <num_iterations> <work_size> <num_threads>\n", argv[0]);
> 		return -1;
> 	}
> 
> 	struct thread_info tinfo;
> 	int s, tnum, num_threads;
> 	pthread_attr_t attr;
> 	pthread_t *threads;
> 	long long *res;
> 	long long *times;
> 
> 	// Get the parameters
> 	tinfo.sleep_time = atoi(argv[1]);
> 	tinfo.num_iterations = atoi(argv[2]);
> 	tinfo.work_size = atoi(argv[3]) * 1024;
> 	num_threads = atoi(argv[4]);
> 
> 	// Initialize the thread creation attributes
> 	s = pthread_attr_init(&attr);
>   if (s != 0) {
> 		printf("Error initializing thread attributes\n");
> 		return -2;
> 	}
> 
> 	// Allocate the memory to track the threads
>  	threads = calloc(num_threads, sizeof(pthread_t));
>  	times = calloc(num_threads, sizeof(unsigned long long));
> 	if (threads == NULL) {
> 		printf("Error allocating memory to track threads\n");
> 		return -3;
> 	}
> 
> 	// Start all of the threads
> 	for (tnum=0; tnum<num_threads; ++tnum) {
> 		s = pthread_create(&threads[tnum], &attr, &do_test, &tinfo);
> 
> 		if (s != 0) {
> 			printf("Error starting thread\n");
> 			return -4;
> 		}
> 	}
> 
> 	// Clean up the thread creation attributes
> 	s = pthread_attr_destroy(&attr);
>   if (s != 0) {
> 		printf("Error cleaning up thread attributes\n");
> 		return -5;
> 	}
> 
> 	// Wait for all the threads to finish
> 	for (tnum=0; tnum<num_threads; ++tnum) {
> 		s = pthread_join(threads[tnum], (void **)(&res));
> 
> 		if (s != 0) {
> 			printf("Error waiting for thread\n");
> 			return -6;
> 		}
> 
> 		// Save the time
> 		times[tnum] = *res;
> 
> 		// And clean it up
> 		free(res);
> 	}
> 
> 	// Calculate the min, max, and average times
> 	float min_time = FLT_MAX;
> 	float max_time = -FLT_MAX;
> 	float avg_time = 0;
> 	for (tnum=0; tnum<num_threads; ++tnum) {
> 		if (times[tnum] < min_time)
> 			min_time = times[tnum];
> 		if (times[tnum] > max_time)
> 			max_time = times[tnum];
> 		avg_time += (times[tnum] - avg_time) / (float)(tnum + 1);
> 	}
> 	// Calculate the standard deviation of the time
> 	float stddev_time = 0;
> 	if (num_threads > 1) {
> 		for (tnum=0; tnum<num_threads; ++tnum)
> 			stddev_time += pow(times[tnum] - avg_time, 2);
> 		stddev_time = sqrtf(stddev_time / (num_threads - 1));
> 	}
> 
> 	// Print out the statistics of the times
> 	printf("time_per_iteration: min: %.1f us avg: %.1f us max: %.1f us stddev: %.1f us\n",
> 			min_time / tinfo.num_iterations,
> 			avg_time / tinfo.num_iterations,
> 			max_time / tinfo.num_iterations,
> 			stddev_time / tinfo.num_iterations);
> 
> 	// Clean up the allocated threads
> 	free(threads);
> 
> 	return 0;
> }


-- 
Only stand for myself


* Re: High CPU usage of scheduler?
  2012-04-27  3:10 ` Yong Zhang
@ 2012-04-27 15:23   ` Dave Johansen
  2012-04-30 17:29     ` Dave Johansen
                       ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Dave Johansen @ 2012-04-27 15:23 UTC (permalink / raw)
  To: Yong Zhang; +Cc: linux-kernel

On Thu, Apr 26, 2012 at 8:10 PM, Yong Zhang <yong.zhang0@gmail.com> wrote:
>
> On Thu, Apr 26, 2012 at 03:08:51PM -0700, Dave Johansen wrote:
> > I am looking into moving an application from RHEL 5 to RHEL 6 and I
> > noticed an unexpected increase in CPU usage. A little digging has led
> > me to believe that the scheduler may be the culprit.
> >
> > I created the attached test_select_work.c file to test this out. I
> > compiled it with the following command on RHEL 5:
> >
> > cc test_select_work.c -O2 -DSLEEP_TYPE=0 -Wall -Wextra -lm -lpthread
> > -o test_select_work
>
> Hmm...Do both RHEL 5 and RHEL 6 have high resolution timer enabled?
>
> If not, could you please try to boot the one which enable high resolution
> timer with 'highres=off' to see if things change?

Yes, RHEL 6 has CONFIG_HIGH_RES_TIMERS=y. I rebooted with 'highres=off'
added in grub and got the following results:

   ./test_select_work 1000 10000 300 4
   time_per_iteration: min: 3130.1 us avg: 3152.2 us max: 3162.2 us
stddev: 15.0 us
   ./test_select_work 1000 10000 300 8
   time_per_iteration: min: 4314.6 us avg: 4407.9 us max: 4496.3 us
stddev: 60.6 us
   ./test_select_work 1000 10000 300 40
   time_per_iteration: min: 8901.7 us avg: 9056.5 us max: 9121.3 us
stddev: 57.5 us

Any other info that might be helpful?

Thanks,
Dave


* Re: High CPU usage of scheduler?
  2012-04-27 15:23   ` Dave Johansen
@ 2012-04-30 17:29     ` Dave Johansen
  2012-04-30 18:50     ` Dave Johansen
  2012-05-01  5:11     ` Dave Johansen
  2 siblings, 0 replies; 9+ messages in thread
From: Dave Johansen @ 2012-04-30 17:29 UTC (permalink / raw)
  To: Yong Zhang; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4840 bytes --]

On Fri, Apr 27, 2012 at 8:23 AM, Dave Johansen <davejohansen@gmail.com> wrote:
> On Thu, Apr 26, 2012 at 8:10 PM, Yong Zhang <yong.zhang0@gmail.com> wrote:
>>
>> On Thu, Apr 26, 2012 at 03:08:51PM -0700, Dave Johansen wrote:
>> > I am looking into moving an application from RHEL 5 to RHEL 6 and I
>> > noticed an unexpected increase in CPU usage. A little digging has led
>> > me to believe that the scheduler may be the culprit.
>> >
>> > I created the attached test_select_work.c file to test this out. I
>> > compiled it with the following command on RHEL 5:
>> >
>> > cc test_select_work.c -O2 -DSLEEP_TYPE=0 -Wall -Wextra -lm -lpthread
>> > -o test_select_work
>>
>> Hmm...Do both RHEL 5 and RHEL 6 have high resolution timer enabled?
>>
>> If not, could you please try to boot the one which enable high resolution
>> timer with 'highres=off' to see if things change?
>
> Yes, RHEL 6 has CONFIG_HIGH_RES_TIMERS=y. I rebooted and used the
> 'highres=off' in grub and got the following results:
>
>   ./test_select_work 1000 10000 300 4
>   time_per_iteration: min: 3130.1 us avg: 3152.2 us max: 3162.2 us
> stddev: 15.0 us
>   ./test_select_work 1000 10000 300 8
>   time_per_iteration: min: 4314.6 us avg: 4407.9 us max: 4496.3 us
> stddev: 60.6 us
>   ./test_select_work 1000 10000 300 40
>   time_per_iteration: min: 8901.7 us avg: 9056.5 us max: 9121.3 us
> stddev: 57.5 us
>
> Any other info that might be helpful?
>
> Thanks,
> Dave

I made some improvements to the program to make comparisons a bit
easier and the standard deviation a bit more meaningful. I have
attached the updated program and script I used to run it. I also made
a git repo to track the code and it's available at
git://github.com/daveisfera/test_sleep.git

I also ran the tests on a Dell Vostro 200 (dual core CPU) with several
different OS versions. I realize that several of these will have
different patches applied and won't be "pure kernel code", but this
was the simplest way that I could run the tests on different versions
of the kernel and get comparisons. I've attached the .svg files output
by gnuplot and the data that was used to generate the plots. I
apologize for having so many attachments, but I figured that this was
the best way to convey the information.

All of these tests were run with the following command with the
executable that was built on Ubuntu 11.10 (except for CentOS 6.2 which
had gcc available on it so I re-compiled it):
./run_test 1000 10 1000 250 8 4

The first interesting result is with Ubuntu 7.10: usleep seems to
behave as expected, but select and poll took significantly longer than
expected. With CentOS 5.6, the opposite is true, so I'm not sure what
conclusion can be drawn from those results.

I tried running the Ubuntu 8.04-4 LiveCD (because that's when the CFS
was introduced), but it wouldn't boot on the Dell Vostro 200.

CentOS 6.2 is where things start to get interesting, because a 10-15%
increase in the execution time is seen with select and poll. With
usleep, that penalty doesn't seem to be present for low numbers of
threads but seems to reach similar levels once the number of threads
is 2x the number of cores.

Ubuntu 11.10, Ubuntu 12.04, and Fedora 16 all show results basically
in line with CentOS 6.2, but with usleep always incurring a penalty
comparable to the other sleep methods. The one exception to this is that on
Fedora 16, poll is lower than the other sleep methods with 3 threads.

One final test that I did was to install the PPA on Ubuntu 11.10 that
has BFS instead of CFS. Info about it can be found at:
https://launchpad.net/~chogydan/+archive/ppa
It doesn't seem to be suffering from this sleep issue (or at least it
suffers in a different way), because the results for 1-3 threads are at
2 ms as expected (basically no penalty), but then at 4 threads it's about
20% higher than the no-sleep method and it seems to maintain more of a
penalty at higher numbers of threads. I also didn't see the system CPU
usage when running the previous test with BFS (but this might just be
an accounting thing since CFS and BFS account for CPU usage
differently).

I know that these results aren't 100% conclusive, but they seem to
indicate to me that something about the sleeping mechanism in the
kernel is using more CPU than I expect it to and that this is the
cause of the increased CPU usage. It doesn't appear to be the
scheduler itself because then a similar sort of penalty would have
been seen with sched_yield. It appears to me that it is something to
do with the sleep mechanism's interaction with the scheduler
(particularly the CFS more than the BFS).
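
If it would help, the sleep call can also be isolated from the work
completely. Here is a rough sketch (not the attached program) that only
issues the 1 ms select() in a loop and reports wall time plus user/system
CPU from getrusage(), so any CPU charged to the sleep/wakeup path shows up
on its own:

#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/select.h>
#include <sys/time.h>

/* Sleep-only loop: no work, just <count> select() sleeps of <usec> each.
 * getrusage() then shows how much user/system CPU the sleeps alone cost. */
int main(int argc, char **argv)
{
	const int count = argc > 1 ? atoi(argv[1]) : 10000;
	const int usec = argc > 2 ? atoi(argv[2]) : 1000;
	struct timeval before, after, ts;
	struct rusage ru;
	int i;

	gettimeofday(&before, NULL);
	for (i = 0; i < count; ++i) {
		ts.tv_sec = 0;
		ts.tv_usec = usec;
		select(0, 0, 0, 0, &ts);
	}
	gettimeofday(&after, NULL);
	getrusage(RUSAGE_SELF, &ru);

	printf("wall: %lld us user: %ld us sys: %ld us\n",
	       1000000LL * (after.tv_sec - before.tv_sec) +
	           (after.tv_usec - before.tv_usec),
	       1000000L * ru.ru_utime.tv_sec + ru.ru_utime.tv_usec,
	       1000000L * ru.ru_stime.tv_sec + ru.ru_stime.tv_usec);
	return 0;
}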

Is there any more data I can gather or tests that I can run that can
help diagnose this problem?

Thanks in advance for any help,
Dave

[-- Attachment #2: test_sleep.c --]
[-- Type: text/x-csrc, Size: 7372 bytes --]

#include <float.h>
#include <math.h>
#include <poll.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/time.h>



// The different types of sleep that are supported
enum sleep_type {
  SLEEP_TYPE_NONE,
  SLEEP_TYPE_SELECT,
  SLEEP_TYPE_POLL,
  SLEEP_TYPE_USLEEP,
  SLEEP_TYPE_YIELD,
};

// Function type for doing work with a sleep
typedef long long *(*work_func)(const int sleep_time, const int num_iterations, const int work_size);

// Information passed to the thread
struct thread_info {
  int sleep_time;
  int num_iterations;
  int work_size;
  work_func func;
};

// In order to make SLEEP_TYPE a run-time parameter function pointers are used.
// The function pointer could have been to the sleep function being used, but
// then that would mean an extra function call inside of the "work loop" and I
// wanted to keep the measurements as tight as possible and the extra work being
// done to be as small/controlled as possible so instead the work is declared as
// a series of macros that are called in all of the sleep functions. The code
// is a bit uglier this way, but I believe it results in a more accurate test.

// Fill in a buffer with random numbers (taken from latt.c by Jens Axboe <jens.axboe@oracle.com>)
#define DECLARE_WORK() \
  int *buf; \
  int pseed; \
  int inum, bnum; \
  struct timeval before, after; \
  long long *diff; \
  buf = calloc(work_size, sizeof(int)); \
  diff = malloc(sizeof(long long)); \
  gettimeofday(&before, NULL)
  
#define DO_WORK(SLEEP_FUNC) \
  for (inum=0; inum<num_iterations; ++inum) { \
    SLEEP_FUNC \
     \
    pseed = 1; \
    for (bnum=0; bnum<work_size; ++bnum) { \
      pseed = pseed * 1103515245 + 12345; \
      buf[bnum] = (pseed / 65536) % 32768; \
    } \
  } \

#define FINISH_WORK() \
  gettimeofday(&after, NULL); \
  *diff = 1000000LL * (after.tv_sec - before.tv_sec); \
  *diff += after.tv_usec - before.tv_usec; \
  free(buf); \
  return diff

long long *do_work_nosleep(const int sleep_time, const int num_iterations, const int work_size)
{
  DECLARE_WORK();

  // Let the compiler know that sleep_time isn't used in this function
  (void)sleep_time;

  DO_WORK();

  FINISH_WORK();
}

long long *do_work_select(const int sleep_time, const int num_iterations, const int work_size)
{
  struct timeval ts;
  DECLARE_WORK();

  DO_WORK(
    ts.tv_sec = 0;
    ts.tv_usec = sleep_time;
    select(0, 0, 0, 0, &ts);
    );

  FINISH_WORK();
}

long long *do_work_poll(const int sleep_time, const int num_iterations, const int work_size)
{
  struct pollfd pfd;
  const int sleep_time_ms = sleep_time / 1000;
  DECLARE_WORK();

  pfd.fd = 0;
  pfd.events = 0;

  DO_WORK(
    poll(&pfd, 1, sleep_time_ms);
    );

  FINISH_WORK();
}

long long *do_work_usleep(const int sleep_time, const int num_iterations, const int work_size)
{
  DECLARE_WORK();

  DO_WORK(
    usleep(sleep_time);
    );

  FINISH_WORK();
}

long long *do_work_yield(const int sleep_time, const int num_iterations, const int work_size)
{
  DECLARE_WORK();

  // Let the compiler know that sleep_time isn't used in this function
  (void)sleep_time;

  DO_WORK(
    sched_yield();
    );

  FINISH_WORK();
}

void *do_test(void *arg)
{
  const struct thread_info *tinfo = (struct thread_info *)arg;

  // Call the function to do the work
  return (*tinfo->func)(tinfo->sleep_time, tinfo->num_iterations, tinfo->work_size);
}

int main(int argc, char **argv)
{
  if (argc <= 6) {
    printf("Usage: %s <sleep_time> <outer_iterations> <inner_iterations> <work_size> <num_threads> <sleep_type>\n", argv[0]);
    printf("  outer_iterations: Number of iterations for each thread (used to calculate statistics)\n");
    printf("  inner_iterations: Number of work/sleep cycles performed in each thread (used to improve consistency/observability))\n");
    printf("  work_size: Number of array elements (in kb) that are filled with psuedo-random numbers\n");
    printf("  num_threads: Number of threads to spawn and perform work/sleep cycles in\n");
    printf("  sleep_type: 0=none 1=select 2=poll 3=usleep 4=yield\n");
    return -1;
  }

  struct thread_info tinfo;
  int outer_iterations;
  int sleep_type;
  int s, inum, tnum, num_threads;
  pthread_attr_t attr;
  pthread_t *threads;
  long long *res;
  long long *times;

  // Get the parameters
  tinfo.sleep_time = atoi(argv[1]);
  outer_iterations = atoi(argv[2]);
  tinfo.num_iterations = atoi(argv[3]);
  tinfo.work_size = atoi(argv[4]) * 1024;
  num_threads = atoi(argv[5]);
  sleep_type = atoi(argv[6]);
  switch (sleep_type) {
    case SLEEP_TYPE_NONE:   tinfo.func = &do_work_nosleep; break;
    case SLEEP_TYPE_SELECT: tinfo.func = &do_work_select;  break;
    case SLEEP_TYPE_POLL:   tinfo.func = &do_work_poll;    break;
    case SLEEP_TYPE_USLEEP: tinfo.func = &do_work_usleep;  break;
    case SLEEP_TYPE_YIELD:  tinfo.func = &do_work_yield;   break;
    default:
      printf("Invalid sleep type: %d\n", sleep_type);
      return -7;
  }

  // Initialize the thread creation attributes
  s = pthread_attr_init(&attr);
  if (s != 0) {
    printf("Error initializing thread attributes\n");
    return -2;
  }

  // Allocate the memory to track the threads
  threads = calloc(num_threads, sizeof(pthread_t));
  times = calloc(num_threads, sizeof(unsigned long long));
  if (threads == NULL) {
    printf("Error allocating memory to track threads\n");
    return -3;
  }

  // Calculate the statistics of the processing
  float min_time = FLT_MAX;
  float max_time = -FLT_MAX;
  float avg_time = 0;
  float prev_avg_time = 0;
  float stddev_time = 0;

  // Perform the requested number of outer iterations
  for (inum=0; inum<outer_iterations; ++inum) {
    // Start all of the threads
    for (tnum=0; tnum<num_threads; ++tnum) {
      s = pthread_create(&threads[tnum], &attr, &do_test, &tinfo);

      if (s != 0) {
        printf("Error starting thread\n");
        return -4;
      }
    }

    // Clean up the thread creation attributes
    s = pthread_attr_destroy(&attr);
    if (s != 0) {
      printf("Error cleaning up thread attributes\n");
      return -5;
    }

    // Wait for all the threads to finish
    for (tnum=0; tnum<num_threads; ++tnum) {
      s = pthread_join(threads[tnum], (void **)(&res));

      if (s != 0) {
        printf("Error waiting for thread\n");
        return -6;
      }

      // Save the time
      times[tnum] = *res;

      // And clean it up
      free(res);
    }

    // Update the statistics
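    // NOTE: tnum restarts at 0 on every outer iteration, so the divisor in
    // the running-average update below is not the total number of samples
    // when outer_iterations > 1; the corrected version posted later in the
    // thread fixes this.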
    for (tnum=0; tnum<num_threads; ++tnum) {
      if (times[tnum] < min_time)
        min_time = times[tnum];
      if (times[tnum] > max_time)
        max_time = times[tnum];
      avg_time += (times[tnum] - avg_time) / (float)(tnum + 1);
      stddev_time += (times[tnum] - prev_avg_time) * (times[tnum] - avg_time);
      prev_avg_time = avg_time;
    }
  }

  // Finish the calculation of the standard deviation
  stddev_time = sqrtf(stddev_time / ((outer_iterations * num_threads) - 1));

  // Print out the statistics of the times
  printf("time_per_iteration: min: %.1f us avg: %.1f us max: %.1f us stddev: %.1f us\n",
      min_time / tinfo.num_iterations,
      avg_time / tinfo.num_iterations,
      max_time / tinfo.num_iterations,
      stddev_time / tinfo.num_iterations);

  // Clean up the allocated threads
  free(threads);

  return 0;
}

[-- Attachment #3: centos_5.6.svg --]
[-- Type: image/svg+xml, Size: 16791 bytes --]

[-- Attachment #4: centos_6.2.svg --]
[-- Type: image/svg+xml, Size: 16776 bytes --]

[-- Attachment #5: fedora_16.svg --]
[-- Type: image/svg+xml, Size: 16536 bytes --]

[-- Attachment #6: results.tgz --]
[-- Type: application/x-gzip, Size: 9128 bytes --]

[-- Attachment #7: ubuntu_7.10.svg --]
[-- Type: image/svg+xml, Size: 16776 bytes --]

[-- Attachment #8: ubuntu_11.10.svg --]
[-- Type: image/svg+xml, Size: 16556 bytes --]

[-- Attachment #9: ubuntu_11.10_bfs.svg --]
[-- Type: image/svg+xml, Size: 16564 bytes --]

[-- Attachment #10: ubuntu_12.04.svg --]
[-- Type: image/svg+xml, Size: 16539 bytes --]

[-- Attachment #11: run_test --]
[-- Type: application/octet-stream, Size: 746 bytes --]

#!/bin/bash

if [ $# -ne 6 ]; then
	echo "Usage: `basename $0` <sleep_time> <outer_iterations> <inner_iterations> <work_size> <max_num_threads> <max_sleep_type>"
	echo "  max_num_threads: The highest value used for num_threads in the results"
	echo "  max_sleep_type: The highest value used for sleep_type in the results"
	exit -1
fi

# Run through the requested number of SLEEP_TYPE values
for i in $(seq 0 $6)
do
	# Remove any existing results file
	rm -f "results_$i.txt"
	# Run through the requested number of threads
	for j in $(seq 1 $5)
	do
		# Print which settings are about to be run
		echo "sleep_type: $i num_threads: $j"
		# Run the test and save it to the results file
		./test_sleep $1 $2 $3 $4 $j $i >> "results_$i.txt"
	done
done


* Re: High CPU usage of scheduler?
  2012-04-27 15:23   ` Dave Johansen
  2012-04-30 17:29     ` Dave Johansen
@ 2012-04-30 18:50     ` Dave Johansen
  2012-05-01  5:11     ` Dave Johansen
  2 siblings, 0 replies; 9+ messages in thread
From: Dave Johansen @ 2012-04-30 18:50 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4746 bytes --]

On Fri, Apr 27, 2012 at 8:23 AM, Dave Johansen <davejohansen@gmail.com> wrote:
> On Thu, Apr 26, 2012 at 8:10 PM, Yong Zhang <yong.zhang0@gmail.com> wrote:
>>
>> On Thu, Apr 26, 2012 at 03:08:51PM -0700, Dave Johansen wrote:
>> > I am looking into moving an application from RHEL 5 to RHEL 6 and I
>> > noticed an unexpected increase in CPU usage. A little digging has led
>> > me to believe that the scheduler may be the culprit.
>> >
>> > I created the attached test_select_work.c file to test this out. I
>> > compiled it with the following command on RHEL 5:
>> >
>> > cc test_select_work.c -O2 -DSLEEP_TYPE=0 -Wall -Wextra -lm -lpthread
>> > -o test_select_work
>>
>> Hmm...Do both RHEL 5 and RHEL 6 have high resolution timer enabled?
>>
>> If not, could you please try to boot the one which enable high resolution
>> timer with 'highres=off' to see if things change?
>
> Yes, RHEL 6 has CONFIG_HIGH_RES_TIMERS=y. I rebooted and used the
> 'highres=off' in grub and got the following results:
>
>   ./test_select_work 1000 10000 300 4
>   time_per_iteration: min: 3130.1 us avg: 3152.2 us max: 3162.2 us
> stddev: 15.0 us
>   ./test_select_work 1000 10000 300 8
>   time_per_iteration: min: 4314.6 us avg: 4407.9 us max: 4496.3 us
> stddev: 60.6 us
>   ./test_select_work 1000 10000 300 40
>   time_per_iteration: min: 8901.7 us avg: 9056.5 us max: 9121.3 us
> stddev: 57.5 us
>
> Any other info that might be helpful?
>
> Thanks,
> Dave

I made some improvements to the program to make comparisons a bit
easier and the standard deviation a bit more meaningful. I have
attached the updated program and script I used to run it. I also made
a git repo to track the code and it's available at:
git://github.com/daveisfera/test_sleep.git

I also ran the tests on a Dell Vostro 200 (dual core CPU) with several
different OS versions. I realize that several of these will have
different patches applied and won't be "pure kernel code", but this
was the simplest way that I could run the tests on different versions
of the kernel and get comparisons. The plots are available in the
bugzilla I opened on this issue:
https://bugzilla.redhat.com/show_bug.cgi?id=812148

All of these tests were run with the following command with the
executable that was built on Ubuntu 11.10 (except for CentOS 6.2 which
had gcc available on it so I re-compiled it):
./run_test 1000 10 1000 250 8 4

The first interesting result is with Ubuntu 7.10: usleep seems to
behave as expected, but select and poll took significantly longer than
expected. With CentOS 5.6, the opposite is true, so I'm not sure what
conclusion can be drawn from those results.

I tried running the Ubuntu 8.04-4 LiveCD (because that's when the CFS
was introduced), but it wouldn't boot on the Dell Vostro 200.

CentOS 6.2 is where things start to get interesting, because a 10-15%
increase in the execution time is seen with select and poll. With
usleep, that penalty doesn't seem to be present for low numbers of
threads but seems to reach similar levels once the number of threads
is 2x the number of cores.

Ubuntu 11.10, Ubuntu 12.04, and Fedora 16 all show results basically
in line with CentOS 6.2, but with usleep always incurring a penalty
comparable to the other sleep methods. The one exception to this is that on
Fedora 16, poll is lower than the other sleep methods with 3 threads.

One final test that I did was to install the PPA on Ubuntu 11.10 that
has BFS instead of CFS. Info about it can be found at:
https://launchpad.net/~chogydan/+archive/ppa
It doesn't seem to be suffering from this sleep issue (or at least it
suffers in a different way), because the results for 1-3 threads are at
2 ms as expected (basically no penalty), but then at 4 threads it's about
20% higher than the no-sleep method and it seems to maintain more of a
penalty at higher numbers of threads. I also didn't see the system CPU
usage when running the previous test with BFS (but this might just be
an accounting thing since CFS and BFS account for CPU usage
differently).

I know that these results aren't 100% conclusive, but they seem to
indicate to me that something about the sleeping mechanism in the
kernel is using more CPU than I expect it to and that this is the
cause of the increased CPU usage. It doesn't appear to be the
scheduler itself because then a similar sort of penalty would have
been seen with sched_yield. It appears to me that it is something to
do with the sleep mechanism's interaction with the scheduler
(particularly the CFS more than the BFS).

Is there any more data I can gather or tests that I can run that can
help diagnose this problem?

Thanks in advance for any help,
Dave

[-- Attachment #2: run_test --]
[-- Type: application/octet-stream, Size: 746 bytes --]

#!/bin/bash

if [ $# -ne 6 ]; then
	echo "Usage: `basename $0` <sleep_time> <outer_iterations> <inner_iterations> <work_size> <max_num_threads> <max_sleep_type>"
	echo "  max_num_threads: The highest value used for num_threads in the results"
	echo "  max_sleep_type: The highest value used for sleep_type in the results"
	exit -1
fi

# Run through the requested number of SLEEP_TYPE values
for i in $(seq 0 $6)
do
	# Remove any existing results file
	rm -f "results_$i.txt"
	# Run through the requested number of threads
	for j in $(seq 1 $5)
	do
		# Print which settings are about to be run
		echo "sleep_type: $i num_threads: $j"
		# Run the test and save it to the results file
		./test_sleep $1 $2 $3 $4 $j $i >> "results_$i.txt"
	done
done

[-- Attachment #3: test_sleep.c --]
[-- Type: text/x-csrc, Size: 7372 bytes --]

#include <float.h>
#include <math.h>
#include <poll.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/time.h>



// The different types of sleep that are supported
enum sleep_type {
  SLEEP_TYPE_NONE,
  SLEEP_TYPE_SELECT,
  SLEEP_TYPE_POLL,
  SLEEP_TYPE_USLEEP,
  SLEEP_TYPE_YIELD,
};

// Function type for doing work with a sleep
typedef long long *(*work_func)(const int sleep_time, const int num_iterations, const int work_size);

// Information passed to the thread
struct thread_info {
  int sleep_time;
  int num_iterations;
  int work_size;
  work_func func;
};

// In order to make SLEEP_TYPE a run-time parameter function pointers are used.
// The function pointer could have been to the sleep function being used, but
// then that would mean an extra function call inside of the "work loop" and I
// wanted to keep the measurements as tight as possible and the extra work being
// done to be as small/controlled as possible so instead the work is declared as
// a series of macros that are called in all of the sleep functions. The code
// is a bit uglier this way, but I believe it results in a more accurate test.

// Fill in a buffer with random numbers (taken from latt.c by Jens Axboe <jens.axboe@oracle.com>)
#define DECLARE_WORK() \
  int *buf; \
  int pseed; \
  int inum, bnum; \
  struct timeval before, after; \
  long long *diff; \
  buf = calloc(work_size, sizeof(int)); \
  diff = malloc(sizeof(long long)); \
  gettimeofday(&before, NULL)
  
#define DO_WORK(SLEEP_FUNC) \
  for (inum=0; inum<num_iterations; ++inum) { \
    SLEEP_FUNC \
     \
    pseed = 1; \
    for (bnum=0; bnum<work_size; ++bnum) { \
      pseed = pseed * 1103515245 + 12345; \
      buf[bnum] = (pseed / 65536) % 32768; \
    } \
  } \

#define FINISH_WORK() \
  gettimeofday(&after, NULL); \
  *diff = 1000000LL * (after.tv_sec - before.tv_sec); \
  *diff += after.tv_usec - before.tv_usec; \
  free(buf); \
  return diff

long long *do_work_nosleep(const int sleep_time, const int num_iterations, const int work_size)
{
  DECLARE_WORK();

  // Let the compiler know that sleep_time isn't used in this function
  (void)sleep_time;

  DO_WORK();

  FINISH_WORK();
}

long long *do_work_select(const int sleep_time, const int num_iterations, const int work_size)
{
  struct timeval ts;
  DECLARE_WORK();

  DO_WORK(
    ts.tv_sec = 0;
    ts.tv_usec = sleep_time;
    select(0, 0, 0, 0, &ts);
    );

  FINISH_WORK();
}

long long *do_work_poll(const int sleep_time, const int num_iterations, const int work_size)
{
  struct pollfd pfd;
  const int sleep_time_ms = sleep_time / 1000;
  DECLARE_WORK();

  pfd.fd = 0;
  pfd.events = 0;

  DO_WORK(
    poll(&pfd, 1, sleep_time_ms);
    );

  FINISH_WORK();
}

long long *do_work_usleep(const int sleep_time, const int num_iterations, const int work_size)
{
  DECLARE_WORK();

  DO_WORK(
    usleep(sleep_time);
    );

  FINISH_WORK();
}

long long *do_work_yield(const int sleep_time, const int num_iterations, const int work_size)
{
  DECLARE_WORK();

  // Let the compiler know that sleep_time isn't used in this function
  (void)sleep_time;

  DO_WORK(
    sched_yield();
    );

  FINISH_WORK();
}

void *do_test(void *arg)
{
  const struct thread_info *tinfo = (struct thread_info *)arg;

  // Call the function to do the work
  return (*tinfo->func)(tinfo->sleep_time, tinfo->num_iterations, tinfo->work_size);
}

int main(int argc, char **argv)
{
  if (argc <= 6) {
    printf("Usage: %s <sleep_time> <outer_iterations> <inner_iterations> <work_size> <num_threads> <sleep_type>\n", argv[0]);
    printf("  outer_iterations: Number of iterations for each thread (used to calculate statistics)\n");
    printf("  inner_iterations: Number of work/sleep cycles performed in each thread (used to improve consistency/observability))\n");
    printf("  work_size: Number of array elements (in kb) that are filled with psuedo-random numbers\n");
    printf("  num_threads: Number of threads to spawn and perform work/sleep cycles in\n");
    printf("  sleep_type: 0=none 1=select 2=poll 3=usleep 4=yield\n");
    return -1;
  }

  struct thread_info tinfo;
  int outer_iterations;
  int sleep_type;
  int s, inum, tnum, num_threads;
  pthread_attr_t attr;
  pthread_t *threads;
  long long *res;
  long long *times;

  // Get the parameters
  tinfo.sleep_time = atoi(argv[1]);
  outer_iterations = atoi(argv[2]);
  tinfo.num_iterations = atoi(argv[3]);
  tinfo.work_size = atoi(argv[4]) * 1024;
  num_threads = atoi(argv[5]);
  sleep_type = atoi(argv[6]);
  switch (sleep_type) {
    case SLEEP_TYPE_NONE:   tinfo.func = &do_work_nosleep; break;
    case SLEEP_TYPE_SELECT: tinfo.func = &do_work_select;  break;
    case SLEEP_TYPE_POLL:   tinfo.func = &do_work_poll;    break;
    case SLEEP_TYPE_USLEEP: tinfo.func = &do_work_usleep;  break;
    case SLEEP_TYPE_YIELD:  tinfo.func = &do_work_yield;   break;
    default:
      printf("Invalid sleep type: %d\n", sleep_type);
      return -7;
  }

  // Initialize the thread creation attributes
  s = pthread_attr_init(&attr);
  if (s != 0) {
    printf("Error initializing thread attributes\n");
    return -2;
  }

  // Allocate the memory to track the threads
  threads = calloc(num_threads, sizeof(pthread_t));
  times = calloc(num_threads, sizeof(unsigned long long));
  if (threads == NULL) {
    printf("Error allocating memory to track threads\n");
    return -3;
  }

  // Calculate the statistics of the processing
  float min_time = FLT_MAX;
  float max_time = -FLT_MAX;
  float avg_time = 0;
  float prev_avg_time = 0;
  float stddev_time = 0;

  // Perform the requested number of outer iterations
  for (inum=0; inum<outer_iterations; ++inum) {
    // Start all of the threads
    for (tnum=0; tnum<num_threads; ++tnum) {
      s = pthread_create(&threads[tnum], &attr, &do_test, &tinfo);

      if (s != 0) {
        printf("Error starting thread\n");
        return -4;
      }
    }

    // Clean up the thread creation attributes
    s = pthread_attr_destroy(&attr);
    if (s != 0) {
      printf("Error cleaning up thread attributes\n");
      return -5;
    }

    // Wait for all the threads to finish
    for (tnum=0; tnum<num_threads; ++tnum) {
      s = pthread_join(threads[tnum], (void **)(&res));

      if (s != 0) {
        printf("Error waiting for thread\n");
        return -6;
      }

      // Save the time
      times[tnum] = *res;

      // And clean it up
      free(res);
    }

    // Update the statistics
    for (tnum=0; tnum<num_threads; ++tnum) {
      if (times[tnum] < min_time)
        min_time = times[tnum];
      if (times[tnum] > max_time)
        max_time = times[tnum];
      avg_time += (times[tnum] - avg_time) / (float)(tnum + 1);
      stddev_time += (times[tnum] - prev_avg_time) * (times[tnum] - avg_time);
      prev_avg_time = avg_time;
    }
  }

  // Finish the calculation of the standard deviation
  stddev_time = sqrtf(stddev_time / ((outer_iterations * num_threads) - 1));

  // Print out the statistics of the times
  printf("time_per_iteration: min: %.1f us avg: %.1f us max: %.1f us stddev: %.1f us\n",
      min_time / tinfo.num_iterations,
      avg_time / tinfo.num_iterations,
      max_time / tinfo.num_iterations,
      stddev_time / tinfo.num_iterations);

  // Clean up the allocated threads
  free(threads);

  return 0;
}


* Re: High CPU usage of scheduler?
  2012-04-27 15:23   ` Dave Johansen
  2012-04-30 17:29     ` Dave Johansen
  2012-04-30 18:50     ` Dave Johansen
@ 2012-05-01  5:11     ` Dave Johansen
  2012-05-01 17:28       ` Dave Johansen
  2 siblings, 1 reply; 9+ messages in thread
From: Dave Johansen @ 2012-05-01  5:11 UTC (permalink / raw)
  To: linux-kernel

On Fri, Apr 27, 2012 at 8:23 AM, Dave Johansen <davejohansen@gmail.com> wrote:
> On Thu, Apr 26, 2012 at 8:10 PM, Yong Zhang <yong.zhang0@gmail.com> wrote:
>>
>> On Thu, Apr 26, 2012 at 03:08:51PM -0700, Dave Johansen wrote:
>> > I am looking into moving an application from RHEL 5 to RHEL 6 and I
>> > noticed an unexpected increase in CPU usage. A little digging has led
>> > me to believe that the scheduler may be the culprit.
>> >
>> > I created the attached test_select_work.c file to test this out. I
>> > compiled it with the following command on RHEL 5:
>> >
>> > cc test_select_work.c -O2 -DSLEEP_TYPE=0 -Wall -Wextra -lm -lpthread
>> > -o test_select_work
>>
>> Hmm...Do both RHEL 5 and RHEL 6 have high resolution timer enabled?
>>
>> If not, could you please try to boot the one which enable high resolution
>> timer with 'highres=off' to see if things change?
>
> Yes, RHEL 6 has CONFIG_HIGH_RES_TIMERS=y. I rebooted and used the
> 'highres=off' in grub and got the following results:
>
>   ./test_select_work 1000 10000 300 4
>   time_per_iteration: min: 3130.1 us avg: 3152.2 us max: 3162.2 us
> stddev: 15.0 us
>   ./test_select_work 1000 10000 300 8
>   time_per_iteration: min: 4314.6 us avg: 4407.9 us max: 4496.3 us
> stddev: 60.6 us
>   ./test_select_work 1000 10000 300 40
>   time_per_iteration: min: 8901.7 us avg: 9056.5 us max: 9121.3 us
> stddev: 57.5 us
>
> Any other info that might be helpful?
>
> Thanks,
> Dave

I made some improvements to the program to make comparisons a bit
easier and the standard deviation a bit more meaningful. It is
available in a git repo at git://github.com/daveisfera/test_sleep.git

I tried sending results from running this updated program on several
OS versions, but it must have been rejected by the filters. The info
can be found in the bugzilla:
https://bugzilla.redhat.com/show_bug.cgi?id=812148

Is there any more data I can gather or tests that I can run that can
help diagnose this problem?

Thanks in advance for any help,
Dave


* Re: High CPU usage of scheduler?
  2012-05-01  5:11     ` Dave Johansen
@ 2012-05-01 17:28       ` Dave Johansen
  2012-05-07 18:12         ` Dave Johansen
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Johansen @ 2012-05-01 17:28 UTC (permalink / raw)
  To: linux-kernel

On Mon, Apr 30, 2012 at 10:11 PM, Dave Johansen <davejohansen@gmail.com> wrote:
> On Fri, Apr 27, 2012 at 8:23 AM, Dave Johansen <davejohansen@gmail.com> wrote:
>> On Thu, Apr 26, 2012 at 8:10 PM, Yong Zhang <yong.zhang0@gmail.com> wrote:
>>>
>>> On Thu, Apr 26, 2012 at 03:08:51PM -0700, Dave Johansen wrote:
>>> > I am looking into moving an application from RHEL 5 to RHEL 6 and I
>>> > noticed an unexpected increase in CPU usage. A little digging has led
>>> > me to believe that the scheduler may be the culprit.
>>> >
>>> > I created the attached test_select_work.c file to test this out. I
>>> > compiled it with the following command on RHEL 5:
>>> >
>>> > cc test_select_work.c -O2 -DSLEEP_TYPE=0 -Wall -Wextra -lm -lpthread
>>> > -o test_select_work
>>>
>>> Hmm...Do both RHEL 5 and RHEL 6 have high resolution timer enabled?
>>>
>>> If not, could you please try to boot the one which enable high resolution
>>> timer with 'highres=off' to see if things change?
>>
>> Yes, RHEL 6 has CONFIG_HIGH_RES_TIMERS=y. I rebooted and used the
>> 'highres=off' in grub and got the following results:
>>
>>   ./test_select_work 1000 10000 300 4
>>   time_per_iteration: min: 3130.1 us avg: 3152.2 us max: 3162.2 us
>> stddev: 15.0 us
>>   ./test_select_work 1000 10000 300 8
>>   time_per_iteration: min: 4314.6 us avg: 4407.9 us max: 4496.3 us
>> stddev: 60.6 us
>>   ./test_select_work 1000 10000 300 40
>>   time_per_iteration: min: 8901.7 us avg: 9056.5 us max: 9121.3 us
>> stddev: 57.5 us
>>
>> Any other info that might be helpful?
>>
>> Thanks,
>> Dave
>
> I made some improvements to the program to make comparisons a bit
> easier and the standard deviation a bit more meaningful. It is
> available in a git repo at git://github.com/daveisfera/test_sleep.git
>
> I tried sending results from running this updated program on several
> OS versions, but it must have been rejected by the filters. The info
> can be found in the bugzilla:
> https://bugzilla.redhat.com/show_bug.cgi?id=812148
>
> Is there any more data I can gather or tests that I can run that can
> help diagnose this problem?
>
> Thanks in advance for any help,
> Dave

I also added the images and descriptions to the stackexchange
question, so they can be more easily viewed there:
http://unix.stackexchange.com/questions/37391/high-cpu-usage-with-cfs


* Re: High CPU usage of scheduler?
  2012-05-01 17:28       ` Dave Johansen
@ 2012-05-07 18:12         ` Dave Johansen
  2012-08-02 21:45           ` Aaron Scamehorn
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Johansen @ 2012-05-07 18:12 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3062 bytes --]

On Tue, May 1, 2012 at 10:28 AM, Dave Johansen <davejohansen@gmail.com> wrote:
>
> On Mon, Apr 30, 2012 at 10:11 PM, Dave Johansen <davejohansen@gmail.com> wrote:
> > On Fri, Apr 27, 2012 at 8:23 AM, Dave Johansen <davejohansen@gmail.com> wrote:
> >> On Thu, Apr 26, 2012 at 8:10 PM, Yong Zhang <yong.zhang0@gmail.com> wrote:
> >>>
> >>> On Thu, Apr 26, 2012 at 03:08:51PM -0700, Dave Johansen wrote:
> >>> > I am looking into moving an application from RHEL 5 to RHEL 6 and I
> >>> > noticed an unexpected increase in CPU usage. A little digging has led
> >>> > me to believe that the scheduler may be the culprit.
> >>> >
> >>> > I created the attached test_select_work.c file to test this out. I
> >>> > compiled it with the following command on RHEL 5:
> >>> >
> >>> > cc test_select_work.c -O2 -DSLEEP_TYPE=0 -Wall -Wextra -lm -lpthread
> >>> > -o test_select_work
> >>>
> >>> Hmm...Do both RHEL 5 and RHEL 6 have high resolution timer enabled?
> >>>
> >>> If not, could you please try to boot the one which enable high resolution
> >>> timer with 'highres=off' to see if things change?
> >>
> >> Yes, RHEL 6 has CONFIG_HIGH_RES_TIMERS=y. I rebooted and used the
> >> 'highres=off' in grub and got the following results:
> >>
> >>   ./test_select_work 1000 10000 300 4
> >>   time_per_iteration: min: 3130.1 us avg: 3152.2 us max: 3162.2 us
> >> stddev: 15.0 us
> >>   ./test_select_work 1000 10000 300 8
> >>   time_per_iteration: min: 4314.6 us avg: 4407.9 us max: 4496.3 us
> >> stddev: 60.6 us
> >>   ./test_select_work 1000 10000 300 40
> >>   time_per_iteration: min: 8901.7 us avg: 9056.5 us max: 9121.3 us
> >> stddev: 57.5 us
> >>
> >> Any other info that might be helpful?
> >>
> >> Thanks,
> >> Dave
> >
> > I made some improvements to the program to make comparisons a bit
> > easier and the standard deviation a bit more meaningful. It is
> > available in a git repo at git://github.com/daveisfera/test_sleep.git
> >
> > I tried sending results from running this updated program on several
> > OS versions, but it must have been rejected by the filters. The info
> > can be found in the bugzilla:
> > https://bugzilla.redhat.com/show_bug.cgi?id=812148
> >
> > Is there any more data I can gather or tests that I can run that can
> > help diagnose this problem?
> >
> > Thanks in advance for any help,
> > Dave
>
> I also added the images and descriptions to the stackexchange
> question, so they can be more easily viewed there:
> http://unix.stackexchange.com/questions/37391/high-cpu-usage-with-cfs

Here's an updated version that measures the user and system time of
each thread and also outputs those results. It also has a correction
for how the mean and standard deviation are updated inside of the
outer_iterations loop.
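
For reference, the essence of that correction is to keep one running count
across all samples from all outer iterations; a sketch of the idea (my own
illustration with made-up sample values, not the exact code from the repo)
using the standard online mean/variance update:

#include <math.h>
#include <stdio.h>

/* Running mean/stddev across every sample from every outer iteration
 * (Welford's online update); compile with -lm. */
int main(void)
{
	/* Made-up per-thread times in us, purely to illustrate the update */
	double samples[] = { 2001, 1995, 2010, 1988, 2005, 1999 };
	const int total = (int)(sizeof(samples) / sizeof(samples[0]));
	double mean = 0, m2 = 0;
	int i, n = 0;

	for (i = 0; i < total; ++i) {
		double delta = samples[i] - mean;
		++n;                            /* one global count, never reset */
		mean += delta / n;
		m2 += delta * (samples[i] - mean);
	}
	printf("avg: %.1f us stddev: %.1f us\n",
	       mean, n > 1 ? sqrt(m2 / (n - 1)) : 0.0);
	return 0;
}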

The updated results are available here:
http://unix.stackexchange.com/questions/37391/high-cpu-usage-with-cfs
and here:
https://bugzilla.redhat.com/show_bug.cgi?id=812148

Any other tests I can run to help diagnose the problem?

Thanks,
Dave

[-- Attachment #2: test_sleep.c --]
[-- Type: text/x-csrc, Size: 11284 bytes --]

#include <limits.h>
#include <math.h>
#include <poll.h>
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/syscall.h>
#include <sys/time.h>


// Apparently GLIBC doesn't provide a wrapper for this function so provide it here
#ifndef HAS_GETTID
pid_t gettid(void)
{
  return syscall(SYS_gettid);
}
#endif


// The different types of sleep that are supported
enum sleep_type {
  SLEEP_TYPE_NONE,
  SLEEP_TYPE_SELECT,
  SLEEP_TYPE_POLL,
  SLEEP_TYPE_USLEEP,
  SLEEP_TYPE_YIELD,
  SLEEP_TYPE_PTHREAD_COND,
  SLEEP_TYPE_NANOSLEEP,
};

// Information returned by the processing thread
struct thread_res {
  long long clock;
  long long user;
  long long sys;
};

// Function type for doing work with a sleep
typedef struct thread_res *(*work_func)(const int pid, const int sleep_time, const int num_iterations, const int work_size);

// Information passed to the thread
struct thread_info {
  pid_t pid;
  int sleep_time;
  int num_iterations;
  int work_size;
  work_func func;
};


inline void get_thread_times(pid_t pid, pid_t tid, unsigned long long *utime, unsigned long long *stime)
{
  char filename[FILENAME_MAX];
  FILE *f;

  sprintf(filename, "/proc/%d/task/%d/stat", pid, tid);
  f = fopen(filename, "r");

  if (f == NULL) {
    *utime = 0;
    *stime = 0;
    return;
  }

  fscanf(f, "%*d %*s %*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %Lu %Lu", utime, stime);

  fclose(f);
}
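
// Note: the utime and stime values read above come back in clock ticks
// (sysconf(_SC_CLK_TCK), typically 100 per second), not microseconds, so
// the per-thread user/sys deltas are far coarser than the gettimeofday()
// wall-clock times.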

// In order to make SLEEP_TYPE a run-time parameter, function pointers are used.
// The function pointer could have pointed at the sleep function being used, but
// that would mean an extra function call inside of the "work loop", and I
// wanted to keep the measurements as tight as possible and the extra work as
// small and controlled as possible. Instead, the work is declared as a series
// of macros that are expanded in all of the sleep functions. The code is a bit
// uglier this way, but I believe it results in a more accurate test.

// Fill in a buffer with random numbers (taken from latt.c by Jens Axboe <jens.axboe@oracle.com>)
#define DECLARE_FUNC(NAME) struct thread_res *do_work_##NAME(const int pid, const int sleep_time, const int num_iterations, const int work_size)

#define DECLARE_WORK() \
  int *buf; \
  int pseed; \
  int inum, bnum; \
  pid_t tid; \
  struct timeval clock_before, clock_after; \
  unsigned long long user_before, user_after; \
  unsigned long long sys_before, sys_after; \
  struct thread_res *diff; \
  tid = gettid(); \
  buf = malloc(work_size * sizeof(*buf)); \
  diff = malloc(sizeof(*diff)); \
  get_thread_times(pid, tid, &user_before, &sys_before); \
  gettimeofday(&clock_before, NULL)
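// The inner loop in DO_WORK below is a simple linear congruential generator
// (using the classic C library rand() constants) that fills 'buf'; it exists
// purely to generate a deterministic amount of CPU-bound work between sleeps.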
  
#define DO_WORK(SLEEP_FUNC) \
  for (inum=0; inum<num_iterations; ++inum) { \
    SLEEP_FUNC \
     \
    pseed = 1; \
    for (bnum=0; bnum<work_size; ++bnum) { \
      pseed = pseed * 1103515245 + 12345; \
      buf[bnum] = (pseed / 65536) % 32768; \
    } \
  } \

#define FINISH_WORK() \
  gettimeofday(&clock_after, NULL); \
  get_thread_times(pid, tid, &user_after, &sys_after); \
  diff->clock = 1000000LL * (clock_after.tv_sec - clock_before.tv_sec); \
  diff->clock += clock_after.tv_usec - clock_before.tv_usec; \
  diff->user = user_after - user_before; \
  diff->sys = sys_after - sys_before; \
  free(buf); \
  return diff
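// Note on units: diff->clock is measured in microseconds (gettimeofday), while
// diff->user and diff->sys are in clock ticks from /proc and are scaled to
// microseconds via clocks_to_usec when update_stats() is called in main().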

DECLARE_FUNC(nosleep)
{
  DECLARE_WORK();

  // Let the compiler know that sleep_time isn't used in this function
  (void)sleep_time;

  DO_WORK();

  FINISH_WORK();
}

DECLARE_FUNC(select)
{
  struct timeval ts;
  DECLARE_WORK();

  DO_WORK(
    ts.tv_sec = 0;
    ts.tv_usec = sleep_time;
    select(0, 0, 0, 0, &ts);
    );

  FINISH_WORK();
}

DECLARE_FUNC(poll)
{
  struct pollfd pfd;
  const int sleep_time_ms = sleep_time / 1000;
  DECLARE_WORK();

  pfd.fd = 0;
  pfd.events = 0;

  DO_WORK(
    poll(&pfd, 1, sleep_time_ms);
    );

  FINISH_WORK();
}

DECLARE_FUNC(usleep)
{
  DECLARE_WORK();

  DO_WORK(
    usleep(sleep_time);
    );

  FINISH_WORK();
}

DECLARE_FUNC(yield)
{
  DECLARE_WORK();

  // Let the compiler know that sleep_time isn't used in this function
  (void)sleep_time;

  DO_WORK(
    sched_yield();
    );

  FINISH_WORK();
}

DECLARE_FUNC(pthread_cond)
{
  pthread_cond_t cond  = PTHREAD_COND_INITIALIZER;
  pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
  struct timespec ts;
  const int sleep_time_ns = sleep_time * 1000;
  DECLARE_WORK();

  pthread_mutex_lock(&mutex);

  DO_WORK(
    clock_gettime(CLOCK_REALTIME, &ts);
    ts.tv_nsec += sleep_time_ns;
    if (ts.tv_nsec >= 1000000000) {
      ts.tv_sec += 1;
      ts.tv_nsec -= 1000000000;
    }
    pthread_cond_timedwait(&cond, &mutex, &ts);
    );

  pthread_mutex_unlock(&mutex);

  pthread_cond_destroy(&cond);
  pthread_mutex_destroy(&mutex);

  FINISH_WORK();
}

DECLARE_FUNC(nanosleep)
{
  struct timespec req, rem;
  const int sleep_time_ns = sleep_time * 1000;
  DECLARE_WORK();

  DO_WORK(
    req.tv_sec = 0;
    req.tv_nsec = sleep_time_ns;
    nanosleep(&req, &rem);
    );

  FINISH_WORK();
}

void *do_test(void *arg)
{
  const struct thread_info *tinfo = (struct thread_info *)arg;

  // Call the function to do the work
  return (*tinfo->func)(tinfo->pid, tinfo->sleep_time, tinfo->num_iterations, tinfo->work_size);
}

struct thread_res_stats {
  double min;
  double max;
  double avg;
  double stddev;
  double prev_avg;
};

#ifdef LLONG_MAX
  #define THREAD_RES_STATS_INITIALIZER {LLONG_MAX, LLONG_MIN, 0, 0, 0}
#else
  #define THREAD_RES_STATS_INITIALIZER {LONG_MAX, LONG_MIN, 0, 0, 0}
#endif

void update_stats(struct thread_res_stats *stats, long long value, int num_samples, int num_iterations, double scale_to_usecs)
{
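  // This is an online (Welford-style) update: 'avg' holds the running mean and
  // 'stddev' accumulates the sum of squared deviations (M2); main() finishes
  // the calculation by taking sqrt(M2 / (num_samples - 1)).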
  // Calculate the average time per iteration
  double value_per_iteration = value * scale_to_usecs / num_iterations;

  // Update the max and min
  if (value_per_iteration < stats->min)
    stats->min = value_per_iteration;
  if (value_per_iteration > stats->max)
    stats->max = value_per_iteration;
  // Update the average
  stats->avg += (value_per_iteration - stats->avg) / (double)(num_samples);
  // Update the standard deviation
  stats->stddev += (value_per_iteration - stats->prev_avg) * (value_per_iteration - stats->avg);
  // And record the current average for use in the next update
  stats->prev_avg= stats->avg;
}

void print_stats(const char *name, const struct thread_res_stats *stats)
{
  printf("%s: min: %.1f us avg: %.1f us max: %.1f us stddev: %.1f us\n",
      name,
      stats->min,
      stats->avg,
      stats->max,
      stats->stddev);
}

int main(int argc, char **argv)
{
  if (argc <= 6) {
    printf("Usage: %s <sleep_time> <outer_iterations> <inner_iterations> <work_size> <num_threads> <sleep_type>\n", argv[0]);
    printf("  outer_iterations: Number of iterations for each thread (used to calculate statistics)\n");
    printf("  inner_iterations: Number of work/sleep cycles performed in each thread (used to improve consistency/observability))\n");
    printf("  work_size: Number of array elements (in kb) that are filled with psuedo-random numbers\n");
    printf("  num_threads: Number of threads to spawn and perform work/sleep cycles in\n");
    printf("  sleep_type: 0=none 1=select 2=poll 3=usleep 4=yield 5=pthread_cond 6=nanosleep\n");
    return -1;
  }

  struct thread_info tinfo;
  int outer_iterations;
  int sleep_type;
  int s, inum, tnum, num_samples, num_threads;
  pthread_attr_t attr;
  pthread_t *threads;
  struct thread_res *res;
  struct thread_res **times;
  // Track the stats for each of the measurements
  struct thread_res_stats stats_clock = THREAD_RES_STATS_INITIALIZER;
  struct thread_res_stats stats_user = THREAD_RES_STATS_INITIALIZER;
  struct thread_res_stats stats_sys = THREAD_RES_STATS_INITIALIZER;
  // Calculate the conversion factor from clock ticks to microseconds
  const long clocks_per_sec = sysconf(_SC_CLK_TCK);
  const double clocks_to_usec = 1000000 / (double)clocks_per_sec;

  // Get the parameters
  tinfo.pid = getpid();
  tinfo.sleep_time = atoi(argv[1]);
  outer_iterations = atoi(argv[2]);
  tinfo.num_iterations = atoi(argv[3]);
  tinfo.work_size = atoi(argv[4]) * 1024;
  num_threads = atoi(argv[5]);
  sleep_type = atoi(argv[6]);
  switch (sleep_type) {
    case SLEEP_TYPE_NONE:   tinfo.func = &do_work_nosleep; break;
    case SLEEP_TYPE_SELECT: tinfo.func = &do_work_select;  break;
    case SLEEP_TYPE_POLL:   tinfo.func = &do_work_poll;    break;
    case SLEEP_TYPE_USLEEP: tinfo.func = &do_work_usleep;  break;
    case SLEEP_TYPE_YIELD:  tinfo.func = &do_work_yield;   break;
    case SLEEP_TYPE_PTHREAD_COND:  tinfo.func = &do_work_pthread_cond;   break;
    case SLEEP_TYPE_NANOSLEEP:  tinfo.func = &do_work_nanosleep;   break;
    default:
      printf("Invalid sleep type: %d\n", sleep_type);
      return -7;
  }

  // Initialize the thread creation attributes
  s = pthread_attr_init(&attr);
  if (s != 0) {
    printf("Error initializing thread attributes\n");
    return -2;
  }

  // Allocate the memory to track the threads
  threads = calloc(num_threads, sizeof(*threads));
  times = calloc(num_threads, sizeof(*times));
  if (threads == NULL || times == NULL) {
    printf("Error allocating memory to track threads\n");
    return -3;
  }

  // Initialize the number of samples
  num_samples = 0;
  // Perform the requested number of outer iterations
  for (inum=0; inum<outer_iterations; ++inum) {
    // Start all of the threads
    for (tnum=0; tnum<num_threads; ++tnum) {
      s = pthread_create(&threads[tnum], &attr, &do_test, &tinfo);

      if (s != 0) {
        printf("Error starting thread\n");
        return -4;
      }
    }

    // Wait for all the threads to finish
    for (tnum=0; tnum<num_threads; ++tnum) {
      s = pthread_join(threads[tnum], (void **)(&res));

      if (s != 0) {
        printf("Error waiting for thread\n");
        return -6;
      }

      // Save the result for processing when they're all done
      times[tnum] = res;
    }

    // For each of the threads
    for (tnum=0; tnum<num_threads; ++tnum) {
      // Increment the number of samples in the statistics
      ++num_samples;
      // Update the statistics with this measurement
      update_stats(&stats_clock, times[tnum]->clock, num_samples, tinfo.num_iterations, 1);
      update_stats(&stats_user, times[tnum]->user, num_samples, tinfo.num_iterations, clocks_to_usec);
      update_stats(&stats_sys, times[tnum]->sys, num_samples, tinfo.num_iterations, clocks_to_usec);
      // And clean it up
      free(times[tnum]);
    }
  }

  // Clean up the thread creation attributes
  s = pthread_attr_destroy(&attr);
  if (s != 0) {
    printf("Error cleaning up thread attributes\n");
    return -5;
  }

  // Finish the calculation of the standard deviation
  stats_clock.stddev = sqrt(stats_clock.stddev / (num_samples - 1));
  stats_user.stddev = sqrt(stats_user.stddev / (num_samples - 1));
  stats_sys.stddev = sqrt(stats_sys.stddev / (num_samples - 1));

  // Print out the statistics of the times
  print_stats("gettimeofday_per_iteration", &stats_clock);
  print_stats("utime_per_iteration", &stats_user);
  print_stats("stime_per_iteration", &stats_sys);

  // Clean up the allocated threads and times
  free(threads);
  free(times);

  return 0;
}


* Re: High CPU usage of scheduler?
  2012-05-07 18:12         ` Dave Johansen
@ 2012-08-02 21:45           ` Aaron Scamehorn
  0 siblings, 0 replies; 9+ messages in thread
From: Aaron Scamehorn @ 2012-08-02 21:45 UTC (permalink / raw)
  To: linux-kernel

Dave Johansen <davejohansen <at> gmail.com> writes:

> 
> Here's an updated version that measures the user and system time of
> each thread and also outputs those results. It also has a correction
> for how the mean and standard deviation are updated inside of the
> outer_iterations loop.
> 
> The updated results are available here:
> http://unix.stackexchange.com/questions/37391/high-cpu-usage-with-cfs
> and here:
> https://bugzilla.redhat.com/show_bug.cgi?id=812148
> 
> Any other tests I can run to help diagnose the problem?
> 
> Thanks,
> Dave
> 
> Attachment (test_sleep.c): text/x-csrc, 11 KiB


I tried to take a look at that bugzilla, but Red Hat is saying that it is "an
internal bugzilla, and not visible to the public."

Have there been any updates on this?

Thanks,
Aaron



end of thread, other threads:[~2012-08-02 21:50 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-04-26 22:08 High CPU usage of scheduler? Dave Johansen
2012-04-27  3:10 ` Yong Zhang
2012-04-27 15:23   ` Dave Johansen
2012-04-30 17:29     ` Dave Johansen
2012-04-30 18:50     ` Dave Johansen
2012-05-01  5:11     ` Dave Johansen
2012-05-01 17:28       ` Dave Johansen
2012-05-07 18:12         ` Dave Johansen
2012-08-02 21:45           ` Aaron Scamehorn
