linux-kernel.vger.kernel.org archive mirror
* Re: VM: qsbench numbers
@ 2001-11-05 15:30 Lorenzo Allegrucci
  0 siblings, 0 replies; 5+ messages in thread
From: Lorenzo Allegrucci @ 2001-11-05 15:30 UTC (permalink / raw)
  To: linux-kernel


[Forgot to CC linux-kernel people, sorry]

At 17.03 04/11/01 -0800, you wrote:
>
>On Sun, 4 Nov 2001, Lorenzo Allegrucci wrote:
>> >
>> >Does "free" after a run has completed imply that there's still lots of
>> >swap used? We _should_ have gotten rid of it at "free_swap_and_cache()"
>> >time, but if we missed it..
>>
>> 70.590u 7.640s 2:31.06 51.7%    0+0k 0+0io 19036pf+0w
>> lenstra:~/src/qsort> free
>>              total       used       free     shared    buffers     cached
>> Mem:        255984       6008     249976          0        100       1096
>> -/+ buffers/cache:       4812     251172
>> Swap:       195512       5080     190432
>
>That's not a noticeable amount, and is perfectly explainable by simply
>having daemons that got swapped out with truly inactive pages. So a
>swapcache leak does not seem to be the reason for the unstable numbers.
>
>> >What happens if you make the "vm_swap_full()" define in <linux/swap.h> be
>> >unconditionally defined to "1"?
>>
>> 70.530u 7.290s 2:33.26 50.7%    0+0k 0+0io 19689pf+0w
>> 70.830u 7.100s 2:29.52 52.1%    0+0k 0+0io 18488pf+0w
>> 70.560u 6.840s 2:28.66 52.0%    0+0k 0+0io 18203pf+0w
>>
>> Performance improved and numbers stabilized.
>
>Indeed.
>
>Mind doing some more tests? In particular, the "vm_swap_full()" macro is
>only used in two places: mm/memory.c and mm/swapfile.c. Are you willing to
>test _which_ one it is (or whether it's both together) that seems to bring
>on the unstable numbers?

mm/memory.c:
#undef vm_swap_full()
#define vm_swap_full() 1

swap=200M
70.480u 7.440s 2:35.74 50.0%    0+0k 0+0io 19897pf+0w
70.640u 7.280s 2:28.87 52.3%    0+0k 0+0io 18453pf+0w
70.750u 7.170s 2:36.26 49.8%    0+0k 0+0io 19719pf+0w

swap=400M
70.120u 6.940s 2:29.55 51.5%    0+0k 0+0io 18598pf+0w
70.160u 7.320s 2:37.34 49.2%    0+0k 0+0io 19720pf+0w
70.020u 11.310s 3:15.09 41.6%   0+0k 0+0io 28330pf+0w


mm/memory.c:
/* #undef vm_swap_full() */
/* #define vm_swap_full() 1 */

mm/swapfile.c:
#undef vm_swap_full()
#define vm_swap_full() 1

swap=200M
69.610u 7.830s 2:33.47 50.4%    0+0k 0+0io 19630pf+0w
70.260u 7.810s 2:54.06 44.8%    0+0k 0+0io 22816pf+0w
70.420u 7.420s 2:42.71 47.8%    0+0k 0+0io 20655pf+0w

swap=400M
70.240u 6.980s 2:40.37 48.1%    0+0k 0+0io 20437pf+0w
70.430u 6.450s 2:25.36 52.8%    0+0k 0+0io 18400pf+0w
70.270u 6.420s 2:25.52 52.7%    0+0k 0+0io 18267pf+0w
70.850u 6.530s 2:35.82 49.6%    0+0k 0+0io 19481pf+0w

The numbers above are bad, but the worst is still to come..

I repeated the original test, with the "vm_swap_full()" define in
<linux/swap.h> unconditionally defined to "1".
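(Concretely, that means replacing the stock definition in <linux/swap.h>,
which as far as I can tell is roughly

#define vm_swap_full() (nr_swap_pages*2 < total_swap_pages)

i.e. "swap counts as full once more than half of it is in use", with an
unconditional

#define vm_swap_full() 1

so swap-cache pages are always reclaimed eagerly.)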

swap=200M
70.510u 7.510s 2:33.91 50.6%    0+0k 0+0io 19584pf+0w
70.100u 7.620s 2:42.20 47.9%    0+0k 0+0io 20562pf+0w
69.840u 7.910s 2:51.61 45.3%    0+0k 0+0io 22541pf+0w
70.370u 7.910s 2:52.06 45.4%    0+0k 0+0io 22793pf+0w

swap=400M
70.560u 7.580s 2:37.38 49.6%    0+0k 0+0io 19962pf+0w
70.120u 7.560s 2:45.04 47.0%    0+0k 0+0io 20403pf+0w
70.390u 7.130s 2:29.82 51.7%    0+0k 0+0io 18159pf+0w <-
70.080u 7.190s 2:29.63 51.6%    0+0k 0+0io 18580pf+0w <-
70.300u 6.810s 2:29.70 51.5%    0+0k 0+0io 18267pf+0w <-
69.770u 7.670s 2:49.68 45.6%    0+0k 0+0io 20980pf+0w

Well, numbers are unstable again.

Either I made an error patching the kernel, or my previous test:
>> 70.530u 7.290s 2:33.26 50.7%    0+0k 0+0io 19689pf+0w
>> 70.830u 7.100s 2:29.52 52.1%    0+0k 0+0io 18488pf+0w
>> 70.560u 6.840s 2:28.66 52.0%    0+0k 0+0io 18203pf+0w
was just a statistical fluctuation.
I don't know, and I would be very thankful if somebody could
confirm or deny these results.
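
(One sanity check on the patch form, since the preprocessor wants a bare
identifier after #undef: the per-file override is probably safer written as

#undef vm_swap_full
#define vm_swap_full() 1

near the top of mm/memory.c / mm/swapfile.c, after the #includes. gcc
should still accept "#undef vm_swap_full()", but with an "extra tokens"
warning, so the effect ought to be the same either way.)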


-- 
Lorenzo


* Re: VM: qsbench numbers
  2001-11-04 21:17   ` Lorenzo Allegrucci
@ 2001-11-05  1:03     ` Linus Torvalds
  0 siblings, 0 replies; 5+ messages in thread
From: Linus Torvalds @ 2001-11-05  1:03 UTC (permalink / raw)
  To: Lorenzo Allegrucci; +Cc: linux-kernel


On Sun, 4 Nov 2001, Lorenzo Allegrucci wrote:
> >
> >Does "free" after a run has completed imply that there's still lots of
> >swap used? We _should_ have gotten rid of it at "free_swap_and_cache()"
> >time, but if we missed it..
>
> 70.590u 7.640s 2:31.06 51.7%    0+0k 0+0io 19036pf+0w
> lenstra:~/src/qsort> free
>              total       used       free     shared    buffers     cached
> Mem:        255984       6008     249976          0        100       1096
> -/+ buffers/cache:       4812     251172
> Swap:       195512       5080     190432

That's not a noticeable amount, and is perfectly explainable by simply
having daemons that got swapped out with truly inactive pages. So a
swapcache leak does not seem to be the reason for the unstable numbers.

> >What happens if you make the "vm_swap_full()" define in <linux/swap.h> be
> >unconditionally defined to "1"?
>
> 70.530u 7.290s 2:33.26 50.7%    0+0k 0+0io 19689pf+0w
> 70.830u 7.100s 2:29.52 52.1%    0+0k 0+0io 18488pf+0w
> 70.560u 6.840s 2:28.66 52.0%    0+0k 0+0io 18203pf+0w
>
> Performance improved and numbers stabilized.

Indeed.

Mind doing some more tests? In particular, the "vm_swap_full()" macro is
only used in two places: mm/memory.c and mm/swapfile.c. Are you willing to
test _which_ one it is (or whether it's both together) that seems to bring
on the unstable numbers?

		Linus



* Re: VM: qsbench numbers
  2001-11-04 14:11 ` Lorenzo Allegrucci
  2001-11-04 17:18   ` Linus Torvalds
@ 2001-11-04 21:17   ` Lorenzo Allegrucci
  2001-11-05  1:03     ` Linus Torvalds
  1 sibling, 1 reply; 5+ messages in thread
From: Lorenzo Allegrucci @ 2001-11-04 21:17 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4279 bytes --]

At 09.18 04/11/01 -0800, Linus Torvalds wrote:
>
>On Sun, 4 Nov 2001, Lorenzo Allegrucci wrote:
>>
>> I begin with the latest Linus kernel: three runs, with kswapd CPU
>> time appended.
>
>It's interesting how your numbers decrease with more swap-space. That,
>together with the fact that the "more swap space" case also degrades the
>second time around, seems to imply that we leave swap-cache pages around
>after they aren't used.
>
>Does "free" after a run has completed imply that there's still lots of
>swap used? We _should_ have gotten rid of it at "free_swap_and_cache()"
>time, but if we missed it..

lenstra:~/src/qsort> free
             total       used       free     shared    buffers     cached
Mem:        255984      16760     239224          0       1092       8008
-/+ buffers/cache:       7660     248324
Swap:       195512          0     195512
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
70.590u 7.640s 2:31.06 51.7%    0+0k 0+0io 19036pf+0w
lenstra:~/src/qsort> free
             total       used       free     shared    buffers     cached
Mem:        255984       6008     249976          0        100       1096
-/+ buffers/cache:       4812     251172
Swap:       195512       5080     190432

and with more swap..

lenstra:~/src/qsort> free
             total       used       free     shared    buffers     cached
Mem:        255984      13488     242496          0        532       5360
-/+ buffers/cache:       7596     248388
Swap:       390592          0     390592
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
70.180u 7.650s 2:43.22 47.6%    0+0k 0+0io 21019pf+0w
lenstra:~/src/qsort> free
             total       used       free     shared    buffers     cached
Mem:        255984       6596     249388          0        108       1116
-/+ buffers/cache:       5372     250612
Swap:       390592       5576     385016
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
71.030u 7.040s 2:49.45 46.0%    0+0k 0+0io 22734pf+0w
lenstra:~/src/qsort> free
             total       used       free     shared    buffers     cached
Mem:        255984       8808     247176          0        108       1152
-/+ buffers/cache:       7548     248436
Swap:       390592       7948     382644


>What happens if you make the "vm_swap_full()" define in <linux/swap.h> be
>unconditionally defined to "1"?

lenstra:~/src/qsort> free
             total       used       free     shared    buffers     cached
Mem:        256000      16772     239228          0       1104       8008
-/+ buffers/cache:       7660     248340
Swap:       195512          0     195512
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
70.530u 7.290s 2:33.26 50.7%    0+0k 0+0io 19689pf+0w
lenstra:~/src/qsort> free
             total       used       free     shared    buffers     cached
Mem:        256000       5132     250868          0        116       1144
-/+ buffers/cache:       3872     252128
Swap:       195512       3748     191764

..and now with 400M of swap:

lenstra:~/src/qsort> free
             total       used       free     shared    buffers     cached
Mem:        256000      13096     242904          0        504       4904
-/+ buffers/cache:       7688     248312
Swap:       390592          0     390592
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
70.830u 7.100s 2:29.52 52.1%    0+0k 0+0io 18488pf+0w
lenstra:~/src/qsort> free
             total       used       free     shared    buffers     cached
Mem:        256000       4980     251020          0        108       1132
-/+ buffers/cache:       3740     252260
Swap:       390592       3840     386752
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
70.560u 6.840s 2:28.66 52.0%    0+0k 0+0io 18203pf+0w
lenstra:~/src/qsort> free
             total       used       free     shared    buffers     cached
Mem:        256000       5044     250956          0        108       1112
-/+ buffers/cache:       3824     252176
Swap:       390592       3896     386696

Performance improved and numbers stabilized.

>That should make us be more aggressive
>about freeing those swap-cache pages, and it would be interesting to see
>if it also stabilizes your numbers.
>
>		Linus

I attach qsbench.c
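
For reference, a minimal way to build and run it, assuming plain gcc.
With -n 90000000 it mallocs about 345MB of ints (assuming 4-byte int),
well over the 256MB of RAM on this box, which is what pushes the run
into swap:

gcc -O2 -o qsbench qsbench.c
time ./qsbench -n 90000000 -p 1 -s 140175100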

[-- Attachment #2: Type: text/plain, Size: 2449 bytes --]

/*
 * Copyright (C) 2001 Lorenzo Allegrucci (lenstra@tiscalinet.it)
 * Licensed under the GPL
 */
#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#define MAX_PROCS	1024

/**
 * quick_sort - Sort in the range [l, r]
 */
void quick_sort(int a[], int l, int r)
{
	int i, j, p, tmp;
	int m, min, max;

	i = l;
	j = r;
	m = (l + r) >> 1;

	if (a[m] >= a[l]) {
		max = a[m];
		min = a[l];
	} else {
		max = a[l];
		min = a[m];
	}

	if (a[r] >= max)
		p = max;
	else {
		if (a[r] >= min)
			p = a[r];
		else
			p = min;
	}

	do {
		while (a[i] < p)
			i++;
		while (p < a[j])
			j--;
		if (i <= j) {
			tmp = a[i];
			a[i] = a[j];
			a[j] = tmp;
			i++;
			j--;
		}
	} while (i <= j);

	if (l < j)
		quick_sort(a, l, j);
	if (i < r)
		quick_sort(a, i, r);
}


/**
 * do_qsort - Fill an array with n random ints, sort it, verify and exit
 */
void do_qsort(int n, int s)
{
	int * a, i, errors = 0;

	if ((a = malloc(sizeof(int) * n)) == NULL) {
		perror("malloc");
		exit(1);
	}

	srand(s);
	//printf("seed = %d\n", s);

	for (i = 0; i < n; i++)
		a[i] = rand();

	quick_sort(a, 0, n - 1);

	//printf("verify... "); fflush(stdout);
	for (i = 0; i < n - 1; i++)
		if (a[i] > a[i + 1])
			errors++;
	//printf("done.\n");
	if (errors)
		fprintf(stderr, "WARNING: %d errors.\n", errors);
	free(a);
	exit(0);
}


/**
 * start_procs - Fork p children, each running do_qsort(n, s), and wait
 */
void start_procs(int n, int p, int s)
{
	int i, pid[MAX_PROCS];
	int status;

	if (p > MAX_PROCS)
		p = MAX_PROCS;

	for (i = 0; i < p; i++) {
		pid[i] = fork();
		if (pid[i] == 0)
			do_qsort(n, s);
		else if (pid[i] < 0)
			perror("fork");
	}

	for (i = 0; i < p; i++)
		waitpid(pid[i], &status, 0);
}

void usage(void)
{
	fprintf(stderr, "Usage: qs [-h] [-n nr_elems] [-p nr_procs]"
			" [-s seed]\n");
	exit(1);
}


int main(int argc, char * argv[])
{
	char *n = "1000000", *p = "1", *s = "1";
	int nr_elems, nr_procs, seed;
	int c;

	if (argc == 1)
		usage();

	while (1) {
		c = getopt(argc, argv, "hn:p:s:V");
		if (c == -1)
			break;

		switch (c) {
		case 'h':
			usage();
		case 'n':
			n = optarg;
			break;
		case 'p':
			p = optarg;
			break;
		case 's':
			s = optarg;
			break;
		case 'V':
			printf("Version 0.93\n");
			return 1;
		case '?':
			return 1;
		}
	}

	nr_elems = atoi(n);
	nr_procs = atoi(p);
	seed = atoi(s);
	start_procs(nr_elems, nr_procs, seed);

	return 0;
}




-- 
Lorenzo


* Re: VM: qsbench numbers
  2001-11-04 14:11 ` Lorenzo Allegrucci
@ 2001-11-04 17:18   ` Linus Torvalds
  2001-11-04 21:17   ` Lorenzo Allegrucci
  1 sibling, 0 replies; 5+ messages in thread
From: Linus Torvalds @ 2001-11-04 17:18 UTC (permalink / raw)
  To: Lorenzo Allegrucci; +Cc: linux-kernel


On Sun, 4 Nov 2001, Lorenzo Allegrucci wrote:
>
> I begin with the latest Linus kernel: three runs, with kswapd CPU
> time appended.

It's interesting how your numbers decrease with more swap-space. That,
together with the fact that the "more swap space" case also degrades the
second time around, seems to imply that we leave swap-cache pages around
after they aren't used.

Does "free" after a run has completed imply that there's still lots of
swap used? We _should_ have gotten rid of it at "free_swap_and_cache()"
time, but if we missed it..

What happens if you make the "vm_swap_full()" define in <linux/swap.h> be
unconditionally defined to "1"? That should make us be more aggressive
about freeing those swap-cache pages, and it would be interesting to see
if it also stabilizes your numbers.

		Linus



* VM: qsbench numbers
@ 2001-11-04 14:11 ` Lorenzo Allegrucci
  2001-11-04 17:18   ` Linus Torvalds
  2001-11-04 21:17   ` Lorenzo Allegrucci
  0 siblings, 2 replies; 5+ messages in thread
From: Lorenzo Allegrucci @ 2001-11-04 14:11 UTC (permalink / raw)
  To: linux-kernel; +Cc: Linus Torvalds


I begin with the latest Linus kernel: three runs, with kswapd CPU
time appended.

Linux-2.4.14-pre8:
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
70.270u 7.330s 2:33.29 50.6%    0+0k 0+0io 19670pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
70.090u 6.890s 2:32.29 50.5%    0+0k 0+0io 18337pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
70.510u 6.660s 2:29.29 51.6%    0+0k 0+0io 18463pf+0w
0:01 kswapd

Double swap space (from 200M to 400M):
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
70.510u 6.390s 2:24.39 53.2%    0+0k 0+0io 17902pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
70.600u 7.600s 2:56.97 44.1%    0+0k 0+0io 23599pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
70.370u 7.340s 2:50.26 45.6%    0+0k 0+0io 22295pf+0w
0:03 kswapd

This is interesting.
Runs 2 and 3 are slower even with more swap space, and the new
VM seems to have lost its proverbial performance stability.

Old results below, for performance and behaviour comparisons.

Linux-2.4.14-pre7:
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
Out of Memory: Killed process 224 (qsbench).
17.770u 3.160s 1:19.95 26.1%    0+0k 0+0io 13294pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
Out of Memory: Killed process 226 (qsbench).
26.030u 15.530s 1:39.39 41.8%   0+0k 0+0io 13283pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
Out of Memory: Killed process 228 (qsbench).
29.350u 41.360s 2:27.63 47.8%   0+0k 0+0io 15214pf+0w
0:12 kswapd

Double swap space:
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
70.530u 2.920s 2:16.35 53.8%    0+0k 0+0io 17575pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
70.510u 3.160s 2:19.79 52.7%    0+0k 0+0io 17639pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
70.540u 3.270s 2:17.39 53.7%    0+0k 0+0io 17544pf+0w
0:01 kswapd


Linux-2.4.14-pre6:
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
Out of Memory: Killed process 224 (qsbench).
69.890u 3.430s 2:12.48 55.3%    0+0k 0+0io 16374pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
Out of Memory: Killed process 226 (qsbench).
69.550u 2.990s 2:11.31 55.2%    0+0k 0+0io 15374pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
Out of Memory: Killed process 228 (qsbench).
69.480u 3.100s 2:13.33 54.4%    0+0k 0+0io 15950pf+0w
0:01 kswapd


Linux-2.4.14-pre5:
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
70.340u 3.450s 2:13.62 55.2%    0+0k 0+0io 16829pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
70.590u 2.940s 2:15.48 54.2%    0+0k 0+0io 17182pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
70.140u 3.480s 2:14.66 54.6%    0+0k 0+0io 17122pf+0w
0:01 kswapd

2.4.14-pre5 has the best VM for qsbench :)


Linux-2.4.13:
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
71.260u 2.150s 2:20.68 52.1%    0+0k 0+0io 20173pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
71.020u 2.050s 2:18.78 52.6%    0+0k 0+0io 20353pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
70.810u 2.080s 2:19.50 52.2%    0+0k 0+0io 20413pf+0w
0:06 kswapd


Linux-2.4.11:
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
71.020u 1.650s 2:20.74 51.6%    0+0k 0+0io 10652pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
71.070u 1.650s 2:21.51 51.3%    0+0k 0+0io 10499pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
70.790u 1.670s 2:21.01 51.3%    0+0k 0+0io 10641pf+0w
0:04 kswapd


Linux-2.4.10:
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
70.410u 1.870s 2:45.25 43.7%    0+0k 0+0io 16088pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
70.910u 1.840s 2:45.16 44.0%    0+0k 0+0io 16338pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
71.310u 1.910s 2:45.20 44.3%    0+0k 0+0io 16211pf+0w
0:03 kswapd


Linux-2.4.13-ac4:
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
70.800u 3.470s 3:04.15 40.3%    0+0k 0+0io 13916pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
71.530u 3.930s 3:13.90 38.9%    0+0k 0+0io 14101pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
71.260u 3.640s 3:03.54 40.8%    0+0k 0+0io 13047pf+0w
0:08 kswapd



-- 
Lorenzo

