* Scalability requirements for sysv ipc (was: ipc: store ipcs into IDRs)
@ 2008-03-21  9:41 Manfred Spraul
  2008-03-21 12:45 ` Nadia Derbey
  0 siblings, 1 reply; 27+ messages in thread
From: Manfred Spraul @ 2008-03-21  9:41 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Nadia Derbey, Andrew Morton, Paul E. McKenney

Hi all,

I noticed that sysv ipc now uses very special locking: first a global 
rw-semaphore, then within that semaphore rcu:
 > linux-2.6.25-rc3:/ipc/util.c:
> struct kern_ipc_perm *ipc_lock(struct ipc_ids *ids, int id)
> {
>         struct kern_ipc_perm *out;
>         int lid = ipcid_to_idx(id);
>
>         down_read(&ids->rw_mutex);
>
>         rcu_read_lock();
>         out = idr_find(&ids->ipcs_idr, lid);
ids->rw_mutex is a per-namespace (i.e.: usually global) semaphore. Thus 
ipc_lock writes into a global cacheline. Everything else is based on 
per-object locking, especially sysv sem doesn't contain a single global 
lock/statistic counter/...
That can't be the Right Thing (tm): Either there are cases where we need 
the scalability (then using IDRs is impossible), or the scalability is 
never needed (then the remaining parts from RCU should be removed).
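
For contrast, the pre-IDR fast path wrote to no shared cacheline at all.
From memory, the old lookup was roughly the following (a simplified
sketch, not the verbatim 2.6.18 source):

struct kern_ipc_perm *old_ipc_lock(struct ipc_ids *ids, int id)
{
	struct kern_ipc_perm *out;
	int lid = id % SEQ_MULTIPLIER;
	struct ipc_id_ary *entries;

	rcu_read_lock();
	/* the id-to-object array itself is RCU-protected */
	entries = rcu_dereference(ids->entries);
	if (lid >= entries->size) {
		rcu_read_unlock();
		return NULL;
	}
	out = entries->p[lid];
	if (out == NULL) {
		rcu_read_unlock();
		return NULL;
	}
	/* only the per-object lock is taken, no global writes */
	spin_lock(&out->lock);
	if (out->deleted) {
		/* raced with removal */
		spin_unlock(&out->lock);
		rcu_read_unlock();
		return NULL;
	}
	return out;
}
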
I don't have a suitable test setup, has anyone performed benchmarks 
recently?
Is sysv semaphore still important, or have all apps moved to posix 
semaphores/futexes?
Nadia: Do you have access to a suitable benchmark?

A microbenchmark on a single-cpu system doesn't help much (except that 
2.6.25 is around a factor of 2 slower for sysv msg ping-pong between two 
tasks compared to the numbers I remember from older kernels....)

--
    Manfred


* Re: Scalability requirements for sysv ipc (was: ipc: store ipcs into IDRs)
  2008-03-21  9:41 Scalability requirements for sysv ipc (was: ipc: store ipcs into IDRs) Manfred Spraul
@ 2008-03-21 12:45 ` Nadia Derbey
  2008-03-21 13:33   ` Scalability requirements for sysv ipc Manfred Spraul
  0 siblings, 1 reply; 27+ messages in thread
From: Nadia Derbey @ 2008-03-21 12:45 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: Linux Kernel Mailing List, Andrew Morton, Paul E. McKenney

Manfred Spraul wrote:
> Hi all,
> 
> I noticed that sysv ipc now uses very special locking: first a global 
> rw-semaphore, then within that semaphore rcu:
>  > linux-2.6.25-rc3:/ipc/util.c:
> 
>> struct kern_ipc_perm *ipc_lock(struct ipc_ids *ids, int id)
>> {
>>         struct kern_ipc_perm *out;
>>         int lid = ipcid_to_idx(id);
>>
>>         down_read(&ids->rw_mutex);
>>
>>         rcu_read_lock();
>>         out = idr_find(&ids->ipcs_idr, lid);
> 
> ids->rw_mutex is a per-namespace (i.e.: usually global) semaphore. Thus 
> ipc_lock writes into a global cacheline. Everything else is based on 
> per-object locking, especially sysv sem doesn't contain a single global 
> lock/statistic counter/...
> That can't be the Right Thing (tm): Either there are cases where we need 
> the scalability (then using IDRs is impossible), or the scalability is 
> never needed (then the remaining parts from RCU should be removed).
> I don't have a suitable test setup, has anyone performed benchmarks 
> recently?
> Is sysv semaphore still important, or have all apps moved to posix 
> semaphores/futexes?
> Nadia: Do you have access to a suitable benchmark?
> 
> A microbenchmark on a single-cpu system doesn't help much (except that 
> 2.6.25 is around a factor of 2 slower for sysv msg ping-pong between two 
> tasks compared to the numbers I remember from older kernels....)
> 

If I remember correctly, at that time I had used ctxbench and I wrote some 
other small scripts.
And the results I had showed around a 2 or 3% slowdown, but I have to 
confirm that by checking in my archives.

I'll also have a look at the remaining RCU critical sections in the code.

Regards,
Nadia




* Re: Scalability requirements for sysv ipc
  2008-03-21 12:45 ` Nadia Derbey
@ 2008-03-21 13:33   ` Manfred Spraul
  2008-03-21 14:13     ` Paul E. McKenney
  2008-03-25 16:00     ` Nadia Derbey
  0 siblings, 2 replies; 27+ messages in thread
From: Manfred Spraul @ 2008-03-21 13:33 UTC (permalink / raw)
  To: Nadia Derbey; +Cc: Linux Kernel Mailing List, Andrew Morton, Paul E. McKenney

Nadia Derbey wrote:
> Manfred Spraul wrote:
>>
>> A microbenchmark on a single-cpu system doesn't help much (except 
>> that 2.6.25 is around a factor of 2 slower for sysv msg ping-pong between 
>> two tasks compared to the numbers I remember from older kernels....)
>>
>
> If I remember correctly, at that time I had used ctxbench and I wrote some 
> other small scripts.
> And the results I had showed around a 2 or 3% slowdown, but I have to 
> confirm that by checking in my archives.
>
Do you have access to multi-core systems? The "best case" for the rcu 
code would be
- 8 or 16 cores
- one instance of ctxbench running on each core, bound to that core.

I'd expect a significant slowdown. The big question is if it matters.

--
    Manfred


* Re: Scalability requirements for sysv ipc
  2008-03-21 13:33   ` Scalability requirements for sysv ipc Manfred Spraul
@ 2008-03-21 14:13     ` Paul E. McKenney
  2008-03-21 16:08       ` Manfred Spraul
  2008-03-25 16:00     ` Nadia Derbey
  1 sibling, 1 reply; 27+ messages in thread
From: Paul E. McKenney @ 2008-03-21 14:13 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: Nadia Derbey, Linux Kernel Mailing List, Andrew Morton

On Fri, Mar 21, 2008 at 02:33:24PM +0100, Manfred Spraul wrote:
> Nadia Derbey wrote:
> >Manfred Spraul wrote:
> >>
> >>A microbenchmark on a single-cpu system doesn't help much (except 
> >>that 2.6.25 is around a factor of 2 slower for sysv msg ping-pong between 
> >>two tasks compared to the numbers I remember from older kernels....)
> >
> >If I remember correctly, at that time I had used ctxbench and I wrote some 
> >other small scripts.
> >And the results I had showed around a 2 or 3% slowdown, but I have to 
> >confirm that by checking in my archives.
> >
> Do you have access to multi-core systems? The "best case" for the rcu 
> code would be
> - 8 or 16 cores
> - one instance of ctxbench running on each core, bound to that core.
> 
> I'd expect a significant slowdown. The big question is if it matters.

I could give it a spin -- though I would need to be pointed to the
patch and the test.

						Thanx, Paul


* Re: Scalability requirements for sysv ipc
  2008-03-21 14:13     ` Paul E. McKenney
@ 2008-03-21 16:08       ` Manfred Spraul
  2008-03-22  5:43         ` Mike Galbraith
  0 siblings, 1 reply; 27+ messages in thread
From: Manfred Spraul @ 2008-03-21 16:08 UTC (permalink / raw)
  To: paulmck; +Cc: Nadia Derbey, Linux Kernel Mailing List, Andrew Morton

Paul E. McKenney wrote:
> I could give it a spin -- though I would need to be pointed to the
> patch and the test.
>
>   
I'd just compare a recent kernel with something older, pre Fri Oct 19 
11:53:44 2007

Then download ctxbench, run one instance on each core, bound with taskset.
http://www.tmr.com/%7Epublic/source/
(I don't use ctxbench myself; if it doesn't work then I could post my 
own app. It would be i386 only with RDTSCs inside)
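
A TSC read on i386 typically looks like this (a generic sketch, not
necessarily the exact code in my app):

static inline unsigned long long rdtsc(void)
{
	unsigned int lo, hi;

	/* RDTSC returns the 64-bit time stamp counter in edx:eax */
	__asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
	return ((unsigned long long)hi << 32) | lo;
}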

I'll try to run it on my PentiumIII/850, right now I'm still setting 
everything up.

--
    Manfred


* Re: Scalability requirements for sysv ipc
  2008-03-21 16:08       ` Manfred Spraul
@ 2008-03-22  5:43         ` Mike Galbraith
  2008-03-22 10:10           ` Manfred Spraul
  2008-03-27 22:29           ` Bill Davidsen
  0 siblings, 2 replies; 27+ messages in thread
From: Mike Galbraith @ 2008-03-22  5:43 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: paulmck, Nadia Derbey, Linux Kernel Mailing List, Andrew Morton


On Fri, 2008-03-21 at 17:08 +0100, Manfred Spraul wrote: 
> Paul E. McKenney wrote:
> > I could give it a spin -- though I would need to be pointed to the
> > patch and the test.
> >
> >   
> I'd just compare a recent kernel with something older, pre Fri Oct 19 
> 11:53:44 2007
> 
> Then download ctxbench, run one instance on each core, bound with taskset.
> http://www.tmr.com/%7Epublic/source/
> (I don't use ctxbench myself; if it doesn't work then I could post my 
> own app. It would be i386 only with RDTSCs inside)

(test gizmos are always welcome)

Results for Q6600 box don't look particularly wonderful.

taskset -c 3 ./ctx -s 

2.6.24.3
3766962 itterations in 9.999845 seconds = 376734/sec

2.6.22.18-cfs-v24.1
4375920 itterations in 10.006199 seconds = 437330/sec

for i in 0 1 2 3; do taskset -c $i ./ctx -s& done

2.6.22.18-cfs-v24.1
4355784 itterations in 10.005670 seconds = 435361/sec
4396033 itterations in 10.005686 seconds = 439384/sec
4390027 itterations in 10.006511 seconds = 438739/sec
4383906 itterations in 10.006834 seconds = 438128/sec

2.6.24.3
1269937 itterations in 9.999757 seconds = 127006/sec
1266723 itterations in 9.999663 seconds = 126685/sec
1267293 itterations in 9.999348 seconds = 126742/sec
1265793 itterations in 9.999766 seconds = 126592/sec

	-Mike



* Re: Scalability requirements for sysv ipc
  2008-03-22  5:43         ` Mike Galbraith
@ 2008-03-22 10:10           ` Manfred Spraul
  2008-03-22 11:53             ` Mike Galbraith
  2008-03-27 22:29           ` Bill Davidsen
  1 sibling, 1 reply; 27+ messages in thread
From: Manfred Spraul @ 2008-03-22 10:10 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: paulmck, Nadia Derbey, Linux Kernel Mailing List, Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 1060 bytes --]

Mike Galbraith wrote:
> taskset -c 3 ./ctx -s 
>
> 2.6.24.3
> 3766962 itterations in 9.999845 seconds = 376734/sec
>
> 2.6.22.18-cfs-v24.1
> 4375920 itterations in 10.006199 seconds = 437330/sec
>
> for i in 0 1 2 3; do taskset -c $i ./ctx -s& done
>
> 2.6.22.18-cfs-v24.1
> 4355784 itterations in 10.005670 seconds = 435361/sec
> 4396033 itterations in 10.005686 seconds = 439384/sec
> 4390027 itterations in 10.006511 seconds = 438739/sec
> 4383906 itterations in 10.006834 seconds = 438128/sec
>
> 2.6.24.3
> 1269937 itterations in 9.999757 seconds = 127006/sec
> 1266723 itterations in 9.999663 seconds = 126685/sec
> 1267293 itterations in 9.999348 seconds = 126742/sec
> 1265793 itterations in 9.999766 seconds = 126592/sec
>
>   
Ouch - 71% slowdown with just 4 cores. Wow.
Attached are my own testapps: one for sysv msg, one for sysv sem.
Could you run them? Taskset is done internally, just execute

$ for i in 1 2 3 4;do ./psem $i 5;./pmsg $i 5;done

Only tested on a uniprocessor; I hope the pthread_setaffinity() calls work 
as expected....

--
    Manfred

[-- Attachment #2: pmsg.cpp --]
[-- Type: text/plain, Size: 4665 bytes --]

/*
 * pmsg.cpp, parallel sysv msg pingpong
 *
 * Copyright (C) 1999, 2001, 2005, 2008 by Manfred Spraul.
 *	All rights reserved except the rights granted by the GPL.
 *
 * Redistribution of this file is permitted under the terms of the GNU 
 * General Public License (GPL) version 2 or later.
 * $Header$
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <getopt.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <pthread.h>

//////////////////////////////////////////////////////////////////////////////

static enum {
	WAITING,
	RUNNING,
	STOPPED,
} volatile g_state = WAITING;

unsigned long long *g_results;
int *g_svmsg_ids;
pthread_t *g_threads;

struct taskinfo {
	int svmsg_id;
	int threadid;
	int sender;
};

#define DATASIZE	8

void* worker_thread(void *arg)
{
	struct taskinfo *ti = (struct taskinfo*)arg;
	unsigned long long rounds;
	int ret;
	struct {
		long mtype;
		char buffer[DATASIZE];
	} mbuf;

	{
		cpu_set_t cpus;
		CPU_ZERO(&cpus);
		CPU_SET(ti->threadid/2, &cpus);
		printf("ti: %d %lxh\n", ti->threadid/2, cpus.__bits[0]);

		ret = pthread_setaffinity_np(g_threads[ti->threadid], sizeof(cpus), &cpus);
		if (ret < 0) {
			printf("pthread_setaffinity_np failed for thread %d with errno %d.\n",
					ti->threadid, errno);
		}

		ret = pthread_getaffinity_np(g_threads[ti->threadid], sizeof(cpus), &cpus);
		if (ret < 0) {
			printf("pthread_getaffinity_np() failed for thread %d with errno %d.\n",
					ti->threadid, errno);
			fflush(stdout);
		} else {
			printf("thread %d: sysvmsg %8d type %d bound to %lxh\n",ti->threadid,
					ti->svmsg_id, ti->sender, cpus.__bits[0]);
		}
		fflush(stdout);
	}

	rounds = 0;
	/* busy-wait until main() flips g_state to RUNNING */
	while(g_state == WAITING) {
#ifdef __i386__
		/* PAUSE eases the spin-wait on hyperthreaded cpus */
		__asm__ __volatile__("pause": : :"memory");
#endif
	}

	if (ti->sender) {
		mbuf.mtype = ti->sender+1;
		ret = msgsnd(ti->svmsg_id, &mbuf, DATASIZE, 0);
		if (ret != 0) {
			printf("Initial send failed, errno %d.\n", errno);
			exit(1);
		}
	}
	while(g_state == RUNNING) {
		/* receive the partner's type: the sender posts mtype 2 and
		 * waits for mtype 1; the receiver posts 1 and waits for 2 */
		int target = 1+!ti->sender;

		ret = msgrcv(ti->svmsg_id, &mbuf, DATASIZE, target, 0);
		if (ret != DATASIZE) {
			if (errno == EIDRM)
				break;
			printf("Error on msgrcv, got %d, errno %d.\n", ret, errno);
			exit(1);
		}
		mbuf.mtype = ti->sender+1;
		ret = msgsnd(ti->svmsg_id, &mbuf, DATASIZE, 0);
		if (ret != 0) {
			if (errno == EIDRM)
				break;
			printf("send failed, errno %d.\n", errno);
			exit(1);
		}
		rounds++;
	}
	/* store result */
	g_results[ti->threadid] = rounds;

	pthread_exit(0);
	return NULL;
}

void init_thread(int thread1, int thread2)
{
	int ret;
	struct taskinfo *ti1, *ti2;

	ti1 = new (struct taskinfo);
	ti2 = new (struct taskinfo);
	if (!ti1 || !ti2) {
		printf("Could not allocate task info\n");
		exit(1);
	}

	g_svmsg_ids[thread1] = msgget(IPC_PRIVATE,0777|IPC_CREAT);
	if(g_svmsg_ids[thread1] == -1) {
		printf(" message queue create failed.\n");
		exit(1);
	}
	ti1->svmsg_id = g_svmsg_ids[thread1];
	ti2->svmsg_id = ti1->svmsg_id;
	ti1->threadid = thread1;
	ti2->threadid = thread2;
	ti1->sender = 1;
	ti2->sender = 0;

	ret = pthread_create(&g_threads[thread1], NULL, worker_thread, ti1);
	if (ret) {
		printf(" pthread_create failed with error code %d\n", ret);
		exit(1);
	}
	ret = pthread_create(&g_threads[thread2], NULL, worker_thread, ti2);
	if (ret) {
		printf(" pthread_create failed with error code %d\n", ret);
		exit(1);
	}
}

//////////////////////////////////////////////////////////////////////////////

int main(int argc, char **argv)
{
	int queues, timeout;
	unsigned long long totals;
	int i;

	printf("pmsg [nr queues] [timeout]\n");
	if (argc != 3) {
		printf(" Invalid parameters.\n");
		return 0;
	}
	queues = atoi(argv[1]);
	timeout = atoi(argv[2]);
	printf("Using %d queues (%d threads) for %d seconds.\n",
			queues, 2*queues, timeout);

	g_results = new unsigned long long[2*queues];
	g_svmsg_ids = new int[queues];
	g_threads = new pthread_t[2*queues];
	for (i=0;i<queues;i++) {
		g_results[i] = 0;
		g_results[i+queues] = 0;
		init_thread(i, i+queues);
	}

	sleep(1);
	g_state = RUNNING;
	sleep(timeout);
	g_state = STOPPED;
	sleep(1);
	for (i=0;i<queues;i++) {
		int res;
		res = msgctl(g_svmsg_ids[i],IPC_RMID,NULL);
		if (res < 0) {
			printf("msgctl(IPC_RMID) failed for %d, errno%d.\n",
				g_svmsg_ids[i], errno);
		}
	}
	for (i=0;i<2*queues;i++)
		pthread_join(g_threads[i], NULL);

	printf("Result matrix:\n");
	totals = 0;
	for (i=0;i<queues;i++) {
		printf("  Thread %3d: %8lld     %3d: %8lld\n",
				i, g_results[i], i+queues, g_results[i+queues]);
		totals += g_results[i] + g_results[i+queues];
	}
	printf("Total: %lld\n", totals);
}

[-- Attachment #3: psem.cpp --]
[-- Type: text/plain, Size: 4840 bytes --]

/*
 * psem.cpp, parallel sysv sem pingpong
 *
 * Copyright (C) 1999, 2001, 2005, 2008 by Manfred Spraul.
 *	All rights reserved except the rights granted by the GPL.
 *
 * Redistribution of this file is permitted under the terms of the GNU 
 * General Public License (GPL) version 2 or later.
 * $Header$
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <getopt.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <pthread.h>

//////////////////////////////////////////////////////////////////////////////

static enum {
	WAITING,
	RUNNING,
	STOPPED,
} volatile g_state = WAITING;

unsigned long long *g_results;
int *g_svsem_ids;
pthread_t *g_threads;

struct taskinfo {
	int svsem_id;
	int threadid;
	int sender;
};

#define DATASIZE	8

void* worker_thread(void *arg)
{
	struct taskinfo *ti = (struct taskinfo*)arg;
	unsigned long long rounds;
	int ret;

	{
		cpu_set_t cpus;
		CPU_ZERO(&cpus);
		CPU_SET(ti->threadid/2, &cpus);
		printf("ti: %d %lxh\n", ti->threadid/2, cpus.__bits[0]);

		ret = pthread_setaffinity_np(g_threads[ti->threadid], sizeof(cpus), &cpus);
		if (ret < 0) {
			printf("pthread_setaffinity_np failed for thread %d with errno %d.\n",
					ti->threadid, errno);
		}

		ret = pthread_getaffinity_np(g_threads[ti->threadid], sizeof(cpus), &cpus);
		if (ret < 0) {
			printf("pthread_getaffinity_np() failed for thread %d with errno %d.\n",
					ti->threadid, errno);
			fflush(stdout);
		} else {
			printf("thread %d: sysvsem %8d type %d bound to %lxh\n",ti->threadid,
					ti->svsem_id, ti->sender, cpus.__bits[0]);
		}
		fflush(stdout);
	}

	rounds = 0;
	/* busy-wait until main() flips g_state to RUNNING */
	while(g_state == WAITING) {
#ifdef __i386__
		/* PAUSE eases the spin-wait on hyperthreaded cpus */
		__asm__ __volatile__("pause": : :"memory");
#endif
	}

	if (ti->sender) {
		struct sembuf sop[1];
		int res;

		/* 1) insert token */
		sop[0].sem_num=0;
		sop[0].sem_op=1;
		sop[0].sem_flg=0;
		res = semop(ti->svsem_id,sop,1);
	
		if (res != 0) {
			printf("Initial semop failed, errno %d.\n", errno);
			exit(1);
		}
	}
	while(g_state == RUNNING) {
		struct sembuf sop[1];
		int res;

		/* 1) retrieve token */
		sop[0].sem_num=ti->sender;
		sop[0].sem_op=-1;
		sop[0].sem_flg=0;
		res = semop(ti->svsem_id,sop,1);
		if (res != 0) {
			/* EIDRM can happen */
			if (errno == EIDRM)
				break;
			printf("main semop failed, errno %d.\n", errno);
			exit(1);
		}

		/* 2) reinsert token */
		sop[0].sem_num=1-ti->sender;
		sop[0].sem_op=1;
		sop[0].sem_flg=0;
		res = semop(ti->svsem_id,sop,1);
		if (res != 0) {
			/* EIDRM can happen */
			if (errno == EIDRM)
				break;
			printf("main semop failed, errno %d.\n", errno);
			exit(1);
		}


		rounds++;
	}
	g_results[ti->threadid] = rounds;

	pthread_exit(0);
	return NULL;
}

void init_thread(int thread1, int thread2)
{
	int ret;
	struct taskinfo *ti1, *ti2;

	ti1 = new (struct taskinfo);
	ti2 = new (struct taskinfo);
	if (!ti1 || !ti2) {
		printf("Could not allocate task info\n");
		exit(1);
	}
	g_svsem_ids[thread1] = semget(IPC_PRIVATE,2,0777|IPC_CREAT);
	if(g_svsem_ids[thread1] == -1) {
		printf(" message queue create failed.\n");
		exit(1);
	}
	ti1->svsem_id = g_svsem_ids[thread1];
	ti2->svsem_id = ti1->svsem_id;
	ti1->threadid = thread1;
	ti2->threadid = thread2;
	ti1->sender = 1;
	ti2->sender = 0;

	ret = pthread_create(&g_threads[thread1], NULL, worker_thread, ti1);
	if (ret) {
		printf(" pthread_create failed with error code %d\n", ret);
		exit(1);
	}
	ret = pthread_create(&g_threads[thread2], NULL, worker_thread, ti2);
	if (ret) {
		printf(" pthread_create failed with error code %d\n", ret);
		exit(1);
	}
}

//////////////////////////////////////////////////////////////////////////////

int main(int argc, char **argv)
{
	int queues, timeout;
	unsigned long long totals;
	int i;

	printf("psem [nr queues] [timeout]\n");
	if (argc != 3) {
		printf(" Invalid parameters.\n");
		return 0;
	}
	queues = atoi(argv[1]);
	timeout = atoi(argv[2]);
	printf("Using %d queues (%d threads) for %d seconds.\n",
			queues, 2*queues, timeout);

	g_results = new unsigned long long[2*queues];
	g_svsem_ids = new int[queues];
	g_threads = new pthread_t[2*queues];
	for (i=0;i<queues;i++) {
		g_results[i] = 0;
		g_results[i+queues] = 0;
		init_thread(i, i+queues);
	}

	sleep(1);
	g_state = RUNNING;
	sleep(timeout);
	g_state = STOPPED;
	sleep(1);
	for (i=0;i<queues;i++) {
		int res;
		res = semctl(g_svsem_ids[i],1,IPC_RMID,NULL);
		if (res < 0) {
			printf("semctl(IPC_RMID) failed for %d, errno%d.\n",
				g_svsem_ids[i], errno);
		}
	}
	for (i=0;i<2*queues;i++)
		pthread_join(g_threads[i], NULL);

	printf("Result matrix:\n");
	totals = 0;
	for (i=0;i<queues;i++) {
		printf("  Thread %3d: %8lld     %3d: %8lld\n",
				i, g_results[i], i+queues, g_results[i+queues]);
		totals += g_results[i] + g_results[i+queues];
	}
	printf("Total: %lld\n", totals);
}


* Re: Scalability requirements for sysv ipc
  2008-03-22 10:10           ` Manfred Spraul
@ 2008-03-22 11:53             ` Mike Galbraith
  2008-03-22 14:22               ` Manfred Spraul
  0 siblings, 1 reply; 27+ messages in thread
From: Mike Galbraith @ 2008-03-22 11:53 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: paulmck, Nadia Derbey, Linux Kernel Mailing List, Andrew Morton


On Sat, 2008-03-22 at 11:10 +0100, Manfred Spraul wrote:

> Attached are my own testapps: one for sysv msg, one for sysv sem.
> Could you run them? Taskset is done internally, just execute
> 
> $ for i in 1 2 3 4;do ./psem $i 5;./pmsg $i 5;done

2.6.22.18-cfs-v24-smp                         2.6.24.3-smp
Result matrix: (psem)
  Thread   0:  2394885       1:  2394885         Thread   0:  2004534       1:  2004535
Total: 4789770                                Total: 4009069
Result matrix: (pmsg)
  Thread   0:  2345913       1:  2345914         Thread   0:  1971000       1:  1971000
Total: 4691827                                Total: 3942000


Result matrix:
  Thread   0:  1613610       2:  1613611          Thread   0:   477112       2:   477111
  Thread   1:  1613590       3:  1613590          Thread   1:   485607       3:   485607
Total: 6454401                                 Total: 1925437
Result matrix:
  Thread   0:  1409956       2:  1409956          Thread   0:   519398       2:   519398
  Thread   1:  1409776       3:  1409776          Thread   1:   519169       3:   519170
Total: 5639464                                 Total: 2077135 


Result matrix:
  Thread   0:   516309       3:   516309           Thread   0:   401157       3:   401157
  Thread   1:   318546       4:   318546           Thread   1:   408252       4:   408252
  Thread   2:   352940       5:   352940           Thread   2:   703600       5:   703600
Total: 2375590                                  Total: 3026018
Result matrix:
  Thread   0:   478356       3:   478356           Thread   0:   344738       3:   344739
  Thread   1:   241655       4:   241655           Thread   1:   343614       4:   343615
  Thread   2:   252444       5:   252445           Thread   2:   589298       5:   589299
Total: 1944911                                   Total: 2555303


Result matrix:
  Thread   0:   443392       4:   443392           Thread   0:   398491       4:   398491
  Thread   1:   443338       5:   443339           Thread   1:   398473       5:   398473
  Thread   2:   444069       6:   444070           Thread   2:   394647       6:   394648
  Thread   3:   444078       7:   444078           Thread   3:   394784       7:   394785
Total: 3549756                                   Total: 3172792
Result matrix:
  Thread   0:   354973       4:   354973           Thread   0:   331307       4:   331307
  Thread   1:   354966       5:   354966           Thread   1:   331220       5:   331221
  Thread   2:   358035       6:   358035           Thread   2:   322852       6:   322852
  Thread   3:   357877       7:   357877           Thread   3:   322899       7:   322899
Total: 2851702                                   Total: 2616557




* Re: Scalability requirements for sysv ipc
  2008-03-22 11:53             ` Mike Galbraith
@ 2008-03-22 14:22               ` Manfred Spraul
  2008-03-22 19:08                 ` Manfred Spraul
  2008-03-22 19:35                 ` Scalability requirements for sysv ipc Mike Galbraith
  0 siblings, 2 replies; 27+ messages in thread
From: Manfred Spraul @ 2008-03-22 14:22 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: paulmck, Nadia Derbey, Linux Kernel Mailing List, Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 260 bytes --]

Mike Galbraith wrote:
> Total: 4691827                                Total: 3942000
>   
Thanks. Unfortunately the test was buggy, it bound the tasks to the 
wrong cpu :-(
Could you run it again? Actually 1 cpu and 4 cpus are probably enough.

--
    Manfred

[-- Attachment #2: pmsg.cpp --]
[-- Type: text/plain, Size: 4653 bytes --]

/*
 * pmsg.cpp, parallel sysv msg pingpong
 *
 * Copyright (C) 1999, 2001, 2005, 2008 by Manfred Spraul.
 *	All rights reserved except the rights granted by the GPL.
 *
 * Redistribution of this file is permitted under the terms of the GNU 
 * General Public License (GPL) version 2 or later.
 * $Header$
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <getopt.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <pthread.h>

//////////////////////////////////////////////////////////////////////////////

static enum {
	WAITING,
	RUNNING,
	STOPPED,
} volatile g_state = WAITING;

unsigned long long *g_results;
int *g_svmsg_ids;
pthread_t *g_threads;

struct taskinfo {
	int svmsg_id;
	int threadid;
	int cpuid;
	int sender;
};

#define DATASIZE	8

void* worker_thread(void *arg)
{
	struct taskinfo *ti = (struct taskinfo*)arg;
	unsigned long long rounds;
	int ret;
	struct {
		long mtype;
		char buffer[DATASIZE];
	} mbuf;

	{
		cpu_set_t cpus;
		CPU_ZERO(&cpus);
		CPU_SET(ti->cpuid, &cpus);

		ret = pthread_setaffinity_np(g_threads[ti->threadid], sizeof(cpus), &cpus);
		if (ret < 0) {
			printf("pthread_setaffinity_np failed for thread %d with errno %d.\n",
					ti->threadid, errno);
		}

		ret = pthread_getaffinity_np(g_threads[ti->threadid], sizeof(cpus), &cpus);
		if (ret < 0) {
			printf("pthread_getaffinity_np() failed for thread %d with errno %d.\n",
					ti->threadid, errno);
			fflush(stdout);
		} else {
			printf("thread %d: sysvmsg %8d type %d bound to %04lxh\n",ti->threadid,
					ti->svmsg_id, ti->sender, cpus.__bits[0]);
		}
		fflush(stdout);
	}

	rounds = 0;
	/* busy-wait until main() flips g_state to RUNNING */
	while(g_state == WAITING) {
#ifdef __i386__
		/* PAUSE eases the spin-wait on hyperthreaded cpus */
		__asm__ __volatile__("pause": : :"memory");
#endif
	}

	if (ti->sender) {
		mbuf.mtype = ti->sender+1;
		ret = msgsnd(ti->svmsg_id, &mbuf, DATASIZE, 0);
		if (ret != 0) {
			printf("Initial send failed, errno %d.\n", errno);
			exit(1);
		}
	}
	while(g_state == RUNNING) {
		/* receive the partner's type: the sender posts mtype 2 and
		 * waits for mtype 1; the receiver posts 1 and waits for 2 */
		int target = 1+!ti->sender;

		ret = msgrcv(ti->svmsg_id, &mbuf, DATASIZE, target, 0);
		if (ret != DATASIZE) {
			if (errno == EIDRM)
				break;
			printf("Error on msgrcv, got %d, errno %d.\n", ret, errno);
			exit(1);
		}
		mbuf.mtype = ti->sender+1;
		ret = msgsnd(ti->svmsg_id, &mbuf, DATASIZE, 0);
		if (ret != 0) {
			if (errno == EIDRM)
				break;
			printf("send failed, errno %d.\n", errno);
			exit(1);
		}
		rounds++;
	}
	/* store result */
	g_results[ti->threadid] = rounds;

	pthread_exit(0);
	return NULL;
}

void init_threads(int cpu, int cpus)
{
	int ret;
	struct taskinfo *ti1, *ti2;

	ti1 = new (struct taskinfo);
	ti2 = new (struct taskinfo);
	if (!ti1 || !ti2) {
		printf("Could not allocate task info\n");
		exit(1);
	}

	g_svmsg_ids[cpu] = msgget(IPC_PRIVATE,0777|IPC_CREAT);
	if(g_svmsg_ids[cpu] == -1) {
		printf(" message queue create failed.\n");
		exit(1);
	}

	g_results[cpu] = 0;
	g_results[cpu+cpus] = 0;

	ti1->svmsg_id = g_svmsg_ids[cpu];
	ti1->threadid = cpu;
	ti1->cpuid = cpu;
	ti1->sender = 1;
	ti2->svmsg_id = g_svmsg_ids[cpu];
	ti2->threadid = cpu+cpus;
	ti2->cpuid = cpu;
	ti2->sender = 0;

	ret = pthread_create(&g_threads[ti1->threadid], NULL, worker_thread, ti1);
	if (ret) {
		printf(" pthread_create failed with error code %d\n", ret);
		exit(1);
	}
	ret = pthread_create(&g_threads[ti2->threadid], NULL, worker_thread, ti2);
	if (ret) {
		printf(" pthread_create failed with error code %d\n", ret);
		exit(1);
	}
}

//////////////////////////////////////////////////////////////////////////////

int main(int argc, char **argv)
{
	int queues, timeout;
	unsigned long long totals;
	int i;

	printf("pmsg [nr queues] [timeout]\n");
	if (argc != 3) {
		printf(" Invalid parameters.\n");
		return 0;
	}
	queues = atoi(argv[1]);
	timeout = atoi(argv[2]);
	printf("Using %d queues/cpus (%d threads) for %d seconds.\n",
			queues, 2*queues, timeout);

	g_results = new unsigned long long[2*queues];
	g_svmsg_ids = new int[queues];
	g_threads = new pthread_t[2*queues];
	for (i=0;i<queues;i++) {
		init_threads(i, queues);
	}

	sleep(1);
	g_state = RUNNING;
	sleep(timeout);
	g_state = STOPPED;
	sleep(1);
	for (i=0;i<queues;i++) {
		int res;
		res = msgctl(g_svmsg_ids[i],IPC_RMID,NULL);
		if (res < 0) {
			printf("msgctl(IPC_RMID) failed for %d, errno%d.\n",
				g_svmsg_ids[i], errno);
		}
	}
	for (i=0;i<2*queues;i++)
		pthread_join(g_threads[i], NULL);

	printf("Result matrix:\n");
	totals = 0;
	for (i=0;i<queues;i++) {
		printf("  Thread %3d: %8lld     %3d: %8lld\n",
				i, g_results[i], i+queues, g_results[i+queues]);
		totals += g_results[i] + g_results[i+queues];
	}
	printf("Total: %lld\n", totals);
}

[-- Attachment #3: psem.cpp --]
[-- Type: text/plain, Size: 4823 bytes --]

/*
 * psem.cpp, parallel sysv sem pingpong
 *
 * Copyright (C) 1999, 2001, 2005, 2008 by Manfred Spraul.
 *	All rights reserved except the rights granted by the GPL.
 *
 * Redistribution of this file is permitted under the terms of the GNU 
 * General Public License (GPL) version 2 or later.
 * $Header$
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <getopt.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <pthread.h>

//////////////////////////////////////////////////////////////////////////////

static enum {
	WAITING,
	RUNNING,
	STOPPED,
} volatile g_state = WAITING;

unsigned long long *g_results;
int *g_svsem_ids;
pthread_t *g_threads;

struct taskinfo {
	int svsem_id;
	int threadid;
	int cpuid;
	int sender;
};

#define DATASIZE	8

void* worker_thread(void *arg)
{
	struct taskinfo *ti = (struct taskinfo*)arg;
	unsigned long long rounds;
	int ret;

	{
		cpu_set_t cpus;
		CPU_ZERO(&cpus);
		CPU_SET(ti->cpuid, &cpus);

		ret = pthread_setaffinity_np(g_threads[ti->threadid], sizeof(cpus), &cpus);
		if (ret < 0) {
			printf("pthread_setaffinity_np failed for thread %d with errno %d.\n",
					ti->threadid, errno);
		}

		ret = pthread_getaffinity_np(g_threads[ti->threadid], sizeof(cpus), &cpus);
		if (ret < 0) {
			printf("pthread_getaffinity_np() failed for thread %d with errno %d.\n",
					ti->threadid, errno);
			fflush(stdout);
		} else {
			printf("thread %d: sysvsem %8d type %d bound to %04lxh\n",ti->threadid,
					ti->svsem_id, ti->sender, cpus.__bits[0]);
		}
		fflush(stdout);
	}

	rounds = 0;
	/* busy-wait until main() flips g_state to RUNNING */
	while(g_state == WAITING) {
#ifdef __i386__
		/* PAUSE eases the spin-wait on hyperthreaded cpus */
		__asm__ __volatile__("pause": : :"memory");
#endif
	}

	if (ti->sender) {
		struct sembuf sop[1];
		int res;

		/* 1) insert token */
		sop[0].sem_num=0;
		sop[0].sem_op=1;
		sop[0].sem_flg=0;
		res = semop(ti->svsem_id,sop,1);
	
		if (res != 0) {
			printf("Initial semop failed, errno %d.\n", errno);
			exit(1);
		}
	}
	while(g_state == RUNNING) {
		struct sembuf sop[1];
		int res;

		/* 1) retrieve token */
		sop[0].sem_num=ti->sender;
		sop[0].sem_op=-1;
		sop[0].sem_flg=0;
		res = semop(ti->svsem_id,sop,1);
		if (res != 0) {
			/* EIDRM can happen */
			if (errno == EIDRM)
				break;
			printf("main semop failed, errno %d.\n", errno);
			exit(1);
		}

		/* 2) reinsert token */
		sop[0].sem_num=1-ti->sender;
		sop[0].sem_op=1;
		sop[0].sem_flg=0;
		res = semop(ti->svsem_id,sop,1);
		if (res != 0) {
			/* EIDRM can happen */
			if (errno == EIDRM)
				break;
			printf("main semop failed, errno %d.\n", errno);
			exit(1);
		}


		rounds++;
	}
	g_results[ti->threadid] = rounds;

	pthread_exit(0);
	return NULL;
}

void init_threads(int cpu, int cpus)
{
	int ret;
	struct taskinfo *ti1, *ti2;

	ti1 = new (struct taskinfo);
	ti2 = new (struct taskinfo);
	if (!ti1 || !ti2) {
		printf("Could not allocate task info\n");
		exit(1);
	}
	g_svsem_ids[cpu] = semget(IPC_PRIVATE,2,0777|IPC_CREAT);
	if(g_svsem_ids[cpu] == -1) {
		printf("sem array create failed.\n");
		exit(1);
	}

	g_results[cpu] = 0;
	g_results[cpu+cpus] = 0;

	ti1->svsem_id = g_svsem_ids[cpu];
	ti1->threadid = cpu;
	ti1->cpuid = cpu;
	ti1->sender = 1;
	ti2->svsem_id = g_svsem_ids[cpu];
	ti2->threadid = cpu+cpus;
	ti2->cpuid = cpu;
	ti2->sender = 0;

	ret = pthread_create(&g_threads[ti1->threadid], NULL, worker_thread, ti1);
	if (ret) {
		printf(" pthread_create failed with error code %d\n", ret);
		exit(1);
	}
	ret = pthread_create(&g_threads[ti2->threadid], NULL, worker_thread, ti2);
	if (ret) {
		printf(" pthread_create failed with error code %d\n", ret);
		exit(1);
	}
}

//////////////////////////////////////////////////////////////////////////////

int main(int argc, char **argv)
{
	int queues, timeout;
	unsigned long long totals;
	int i;

	printf("psem [nr queues] [timeout]\n");
	if (argc != 3) {
		printf(" Invalid parameters.\n");
		return 0;
	}
	queues = atoi(argv[1]);
	timeout = atoi(argv[2]);
	printf("Using %d queues/cpus (%d threads) for %d seconds.\n",
			queues, 2*queues, timeout);

	g_results = new unsigned long long[2*queues];
	g_svsem_ids = new int[queues];
	g_threads = new pthread_t[2*queues];
	for (i=0;i<queues;i++) {
		init_threads(i, queues);
	}

	sleep(1);
	g_state = RUNNING;
	sleep(timeout);
	g_state = STOPPED;
	sleep(1);
	for (i=0;i<queues;i++) {
		int res;
		res = semctl(g_svsem_ids[i],1,IPC_RMID,NULL);
		if (res < 0) {
			printf("semctl(IPC_RMID) failed for %d, errno%d.\n",
				g_svsem_ids[i], errno);
		}
	}
	for (i=0;i<2*queues;i++)
		pthread_join(g_threads[i], NULL);

	printf("Result matrix:\n");
	totals = 0;
	for (i=0;i<queues;i++) {
		printf("  Thread %3d: %8lld     %3d: %8lld\n",
				i, g_results[i], i+queues, g_results[i+queues]);
		totals += g_results[i] + g_results[i+queues];
	}
	printf("Total: %lld\n", totals);
}


* Re: Scalability requirements for sysv ipc
  2008-03-22 14:22               ` Manfred Spraul
@ 2008-03-22 19:08                 ` Manfred Spraul
  2008-03-25 15:50                   ` Mike Galbraith
  2008-03-22 19:35                 ` Scalability requirements for sysv ipc Mike Galbraith
  1 sibling, 1 reply; 27+ messages in thread
From: Manfred Spraul @ 2008-03-22 19:08 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Mike Galbraith, paulmck, Nadia Derbey, Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 636 bytes --]

Hi all,

I've revived my Dual-CPU Pentium III/850:
I couldn't see a scalability problem (two cpus are around 190%), but 
just the normal performance of 2.6.25-rc3 is abysmal, 55 to 60% slower 
than 2.6.18.8:

psem                2.6.18      2.6.25   Diff [%]
1 cpu              948,005     398,435     -57.97
2 cpus           1,768,273     734,816     -58.44
Scalability [%]     193.26      192.21

pmsg                2.6.18      2.6.25   Diff [%]
1 cpu              821,582     356,904     -56.56
2 cpus           1,488,058     661,754     -55.53
Scalability [%]     190.56      192.71

Attached are the .config files and the individual results.
Did I accidentally enable a scheduler debug option?

--
    Manfred

[-- Attachment #2: bench.tar.gz --]
[-- Type: application/x-gzip, Size: 38101 bytes --]


* Re: Scalability requirements for sysv ipc
  2008-03-22 14:22               ` Manfred Spraul
  2008-03-22 19:08                 ` Manfred Spraul
@ 2008-03-22 19:35                 ` Mike Galbraith
  2008-03-23  6:38                   ` Manfred Spraul
  2008-03-23  7:08                   ` Mike Galbraith
  1 sibling, 2 replies; 27+ messages in thread
From: Mike Galbraith @ 2008-03-22 19:35 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: paulmck, Nadia Derbey, Linux Kernel Mailing List, Andrew Morton


On Sat, 2008-03-22 at 15:22 +0100, Manfred Spraul wrote:
> Mike Galbraith wrote:
> > Total: 4691827                                Total: 3942000
> >   
> Thanks. Unfortunately the test was buggy, it bound the tasks to the 
> wrong cpu :-(
> Could you run it again? Actually 1 cpu and 4 cpus are probably enough.

Sure.  (ran as before, hopefully no transcription errors)

2.6.22.18-cfs-v24-smp                         2.6.24.3-smp
Result matrix: (psem)
  Thread   0:  2395778       1:  2395779         Thread   0:  2054990       1:  2054992
Total: 4791557                                Total: 4009069
Result matrix: (pmsg)
  Thread   0:  2317014       1:  2317015         Thread   0:  1959099       1:  1959099
Total: 4634029                                Total: 3918198


Result matrix:
  Thread   0:  2340716       2:  2340716         Thread   0:  1890292       2:  1890293
  Thread   1:  2361052       3:  2361052         Thread   1:  1899031       3:  1899032
Total: 9403536                                 Total: 7578648
Result matrix:
  Thread   0:  1429567       2:  1429567         Thread   0:  1295071       2:  1295071
  Thread   1:  1429267       3:  1429268         Thread   1:  1289253       3:  1289254
Total: 5717669                                 Total: 5168649 


Result matrix:
  Thread   0:  2263039       3:  2263039         Thread   0:  1351208       3:  1351209
  Thread   1:  2265120       4:  2265121         Thread   1:  1351300       4:  1351300
  Thread   2:  2263642       5:  2263642         Thread   2:  1319512       5:  1319512
Total: 13583603                                Total: 8044041
Result matrix:
  Thread   0:   483934       3:   483934         Thread   0:   514766       3:   514767
  Thread   1:   239714       4:   239715         Thread   1:   252764       4:   252765
  Thread   2:   270216       5:   270216         Thread   2:   253216       5:   253217
Total: 1987729                                 Total: 2041495


Result matrix:
  Thread   0:  2260038       4:  2260039         Thread   0:   642235       4:   642236
  Thread   1:  2262748       5:  2262749         Thread   1:   642742       5:   642743
  Thread   2:  2271236       6:  2271237         Thread   2:   640281       6:   640282
  Thread   3:  2257651       7:  2257652         Thread   3:   641931       7:   641931
Total: 18103350                                Total: 5134381
Result matrix:
  Thread   0:   382811       4:   382811         Thread   0:   342297       4:   342297
  Thread   1:   382801       5:   382802         Thread   1:   342309       5:   342310
  Thread   2:   376620       6:   376621         Thread   2:   343857       6:   343857
  Thread   3:   376559       7:   376559         Thread   3:   343836       7:   343836
Total: 3037584                                 Total: 2744599




* Re: Scalability requirements for sysv ipc
  2008-03-22 19:35                 ` Scalability requirements for sysv ipc Mike Galbraith
@ 2008-03-23  6:38                   ` Manfred Spraul
  2008-03-23  7:15                     ` Mike Galbraith
  2008-03-23  7:08                   ` Mike Galbraith
  1 sibling, 1 reply; 27+ messages in thread
From: Manfred Spraul @ 2008-03-23  6:38 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: paulmck, Nadia Derbey, Linux Kernel Mailing List, Andrew Morton

Mike Galbraith wrote:
> On Sat, 2008-03-22 at 15:22 +0100, Manfred Spraul wrote:
>   
>> Mike Galbraith wrote:
>>     
>>> Total: 4691827                                Total: 3942000
>>>   
>>>       
>> Thanks. Unfortunately the test was buggy, it bound the tasks to the 
>> wrong cpu :-(
>> Could you run it again? Actually 1 cpu and 4 cpus are probably enough.
>>     
>
> Sure.  (ran as before, hopefully no transcription errors)
>
>   
Thanks:
sysv sem:
- 2.6.22 had almost linear scaling (up to 4 cores).
- 2.6.24.3 scales to 2 cpus, then it collapses. With 4 cores, it's 75% 
slower than 2.6.22.

sysv msg:
- neither 2.6.22 nor 2.6.24 scales very well. That's more or less 
expected, the message queue code contains a few global statistic 
counters (msg_hdrs, msg_bytes).
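Those counters are updated on every msgsnd()/msgrcv() in the namespace;
from memory, the hot path does roughly this (a sketch, not the verbatim
2.6.24 source):

	atomic_add(msgsz, &ns->msg_bytes);	/* shared cacheline, */
	atomic_inc(&ns->msg_hdrs);		/* bounces between cpus */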

The cleanup of sysv ipc is nice, but IMHO sysv sem should remain scalable - 
and a global semaphore with an IDR can't be as scalable as the RCU-protected 
array that was used before.

--
    Manfred


* Re: Scalability requirements for sysv ipc
  2008-03-22 19:35                 ` Scalability requirements for sysv ipc Mike Galbraith
  2008-03-23  6:38                   ` Manfred Spraul
@ 2008-03-23  7:08                   ` Mike Galbraith
  2008-03-23  7:20                     ` Mike Galbraith
  1 sibling, 1 reply; 27+ messages in thread
From: Mike Galbraith @ 2008-03-23  7:08 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: paulmck, Nadia Derbey, Linux Kernel Mailing List, Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 1108 bytes --]


On Sat, 2008-03-22 at 20:35 +0100, Mike Galbraith wrote:
> On Sat, 2008-03-22 at 15:22 +0100, Manfred Spraul wrote:
> > Mike Galbraith wrote:
> > > Total: 4691827                                Total: 3942000
> > >   
> > Thanks. Unfortunately the test was buggy, it bound the tasks to the 
> > wrong cpu :-(
> > Could you run it again? Actually 1 cpu and 4 cpus are probably enough.
> 
> Sure.  (ran as before, hopefully no transcription errors)

Looking at the output over morning java, I noticed that pmsg didn't get
recompiled due to a fat finger, so those numbers are bogus.  Corrected
condensed version of output is below, charted data attached.

(hope evolution doesn't turn this into something other than plain text)



queues/cpus                       1          2          3          4
2.6.22.18-cfs-v24.1 psem    4791557    9403536   13583603   18103350
2.6.22.18-cfs-v24.1 pmsg    4906249    9171440   13264752   17774106
2.6.24.3 psem               4009069    7578648    8044041    5134381
2.6.24.3 pmsg               3917588    7290206    7644794    4824967


[-- Attachment #2: xxxx.pdf --]
[-- Type: application/pdf, Size: 16243 bytes --]


* Re: Scalability requirements for sysv ipc
  2008-03-23  6:38                   ` Manfred Spraul
@ 2008-03-23  7:15                     ` Mike Galbraith
  0 siblings, 0 replies; 27+ messages in thread
From: Mike Galbraith @ 2008-03-23  7:15 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: paulmck, Nadia Derbey, Linux Kernel Mailing List, Andrew Morton


On Sun, 2008-03-23 at 07:38 +0100, Manfred Spraul wrote:
> Mike Galbraith wrote:
> > On Sat, 2008-03-22 at 15:22 +0100, Manfred Spraul wrote:
> >   
> >> Mike Galbraith wrote:
> >>     
> >>> Total: 4691827                                Total: 3942000
> >>>   
> >>>       
> >> Thanks. Unfortunately the test was buggy, it bound the tasks to the 
> >> wrong cpu :-(
> >> Could you run it again? Actually 1 cpu and 4 cpus are probably enough.
> >>     
> >
> > Sure.  (ran as before, hopefully no transcription errors)
> >
> >   
> Thanks:
> sysv sem:
> - 2.6.22 had almost linear scaling (up to 4 cores).
> - 2.6.24.3 scales to 2 cpus, then it collapses. with 4 cores, it's 75% 
> slower than 2.6.22.
> 
> sysv msg:
> - neither 2.6.22 nor 2.6.24 scales very well. That's more or less 
> expected, the message queue code contains a few global statistic 
> counters (msg_hdrs, msg_bytes).

Actually, 2.6.22 is fine, and 2.6.24.3 is not, just like sysv sem.  I just
noticed that pmsg didn't get recompiled last night (fat finger), and
sent a correction. 
 
	-Mike



* Re: Scalability requirements for sysv ipc
  2008-03-23  7:08                   ` Mike Galbraith
@ 2008-03-23  7:20                     ` Mike Galbraith
  0 siblings, 0 replies; 27+ messages in thread
From: Mike Galbraith @ 2008-03-23  7:20 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: paulmck, Nadia Derbey, Linux Kernel Mailing List, Andrew Morton


On Sun, 2008-03-23 at 08:08 +0100, Mike Galbraith wrote:
> On Sat, 2008-03-22 at 20:35 +0100, Mike Galbraith wrote:
> > On Sat, 2008-03-22 at 15:22 +0100, Manfred Spraul wrote:
> > > Mike Galbraith wrote:
> > > > Total: 4691827                                Total: 3942000
> > > >   
> > > Thanks. Unfortunately the test was buggy, it bound the tasks to the 
> > > wrong cpu :-(
> > > Could you run it again? Actually 1 cpu and 4 cpus are probably enough.
> > 
> > Sure.  (ran as before, hopefully no transcription errors)
> 
> Looking at the output over morning java, I noticed that pmsg didn't get
> recompiled due to a fat finger, so those numbers are bogus.  Corrected
> condensed version of output is below, charted data attached.
> 
> (hope evolution doesn't turn this into something other than plain text)

Pff, I'd rather have had the bounce.  Good thing I attached the damn
chart, evolution can't screw that up.

> 
> 
> 
> queues/cpus                       1          2          3          4
> 2.6.22.18-cfs-v24.1 psem    4791557    9403536   13583603   18103350
> 2.6.22.18-cfs-v24.1 pmsg    4906249    9171440   13264752   17774106
> 2.6.24.3 psem               4009069    7578648    8044041    5134381
> 2.6.24.3 pmsg               3917588    7290206    7644794    4824967
> 



* Re: Scalability requirements for sysv ipc
  2008-03-22 19:08                 ` Manfred Spraul
@ 2008-03-25 15:50                   ` Mike Galbraith
  2008-03-25 16:13                     ` Peter Zijlstra
  2008-03-30 14:12                     ` Scalability requirements for sysv ipc (+namespaces broken with SEM_UNDO) Manfred Spraul
  0 siblings, 2 replies; 27+ messages in thread
From: Mike Galbraith @ 2008-03-25 15:50 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: Linux Kernel Mailing List, paulmck, Nadia Derbey, Andrew Morton,
	Peter Zijlstra

[-- Attachment #1: Type: text/plain, Size: 1879 bytes --]


On Sat, 2008-03-22 at 20:08 +0100, Manfred Spraul wrote:

> just the normal performance of 2.6.25-rc3 is abysmal, 55 to 60% slower 
> than 2.6.18.8:

After manually reverting 3e148c79938aa39035669c1cfa3ff60722134535,
2.6.25.git scaled linearly, but as you noted, markedly down from earlier
kernels with this benchmark.  2.6.24.4 with same revert, but all
2.6.25.git ipc changes piled on top still performed close to 2.6.22, so
I went looking.  Bisection led me to..

8f4d37ec073c17e2d4aa8851df5837d798606d6f is first bad commit
commit 8f4d37ec073c17e2d4aa8851df5837d798606d6f
Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date:   Fri Jan 25 21:08:29 2008 +0100

    sched: high-res preemption tick

    Use HR-timers (when available) to deliver an accurate preemption tick.

    The regular scheduler tick that runs at 1/HZ can be too coarse when nice
    level are used. The fairness system will still keep the cpu utilisation 'fair'
    by then delaying the task that got an excessive amount of CPU time but try to
    minimize this by delivering preemption points spot-on.

    The average frequency of this extra interrupt is sched_latency / nr_latency.
    Which need not be higher than 1/HZ, its just that the distribution within the
    sched_latency period is important.

    Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>

:040000 040000 ab225228500f7a19d5ad20ca12ca3fc8ff5f5ad1 f1742e1d225a72aecea9d6961ed989b5943d31d8 M     arch
:040000 040000 25d85e4ef7a71b0cc76801a2526ebeb4dce180fe ae61510186b4fad708ef0211ac169decba16d4e5 M     include
:040000 040000 9247cec7dd506c648ac027c17e5a07145aa41b26 950832cc1dc4d30923f593ecec883a06b45d62e9 M     kernel

...and I verified it via :-/ echo 7 > sched_features in latest.  That
only bought me roughly half though, so there's a part three in there
somewhere.

	-Mike

[-- Attachment #2: xxxx.pdf --]
[-- Type: application/pdf, Size: 17909 bytes --]


* Re: Scalability requirements for sysv ipc
  2008-03-21 13:33   ` Scalability requirements for sysv ipc Manfred Spraul
  2008-03-21 14:13     ` Paul E. McKenney
@ 2008-03-25 16:00     ` Nadia Derbey
  1 sibling, 0 replies; 27+ messages in thread
From: Nadia Derbey @ 2008-03-25 16:00 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: Linux Kernel Mailing List, Andrew Morton, Paul E. McKenney

Manfred Spraul wrote:
> Nadia Derbey wrote:
> 
>> Manfred Spraul wrote:
>>
>>>
>>> A microbenchmark on a single-cpu system doesn't help much (except 
>>> that 2.6.25 is around a factor of 2 slower for sysv msg ping-pong between 
>>> two tasks compared to the numbers I remember from older kernels....)
>>>
>>
>> If I remember correctly, at that time I had used ctxbench and I wrote some 
>> other small scripts.
>> And the results I had showed around a 2 or 3% slowdown, but I have to 
>> confirm that by checking in my archives.
>>
> Do you have access to multi-core systems? The "best case" for the rcu 
> code would be
> - 8 or 16 cores
> - one instance of ctxbench running on each core, bound to that core.
> 
> I'd expect a significant slowdown. The big question is if it matters.
> 
> -- 
>    Manfred
> 
> 

Hi,

Here is what I could find on my side:

=============================================================

lkernel@akt$ cat tst3/res_new/output
[root@akt tests]# echo 32768 > /proc/sys/kernel/msgmni
[root@akt tests]# ./msgbench_std_dev_plot -n
32768000 msgget iterations in 21.469724 seconds = 1526294/sec

32768000 msgsnd iterations in 18.891328 seconds = 1734583/sec

32768000 msgctl(ipc_stat) iterations in 15.359802 seconds = 2133472/sec

32768000 msgctl(msg_stat) iterations in 15.296114 seconds = 2142260/sec

32768000 msgctl(ipc_rmid) iterations in 32.981277 seconds = 993542/sec

             AVERAGE        STD_DEV      MIN     MAX
GET:        21469.724000   566.024657   19880   23607
SEND:       18891.328000   515.542311   18433   21962
IPC_STAT:   15359.802000   274.918673   15147   17166
MSG_STAT:   15296.114000   155.775508   15138   16790
RM:         32981.277000   675.621060   32141   35433


lkernel@akt$ cat tst3/res_ref/output
[root@akt tests]# echo 32768 > /proc/sys/kernel/msgmni
[root@akt tests]# ./msgbench_std_dev_plot -r
32768000 msgget iterations in 665.842852 seconds = 49213/sec

32768000 msgsnd iterations in 18.363853 seconds = 1784458/sec

32768000 msgctl(ipc_stat) iterations in 14.609669 seconds = 2243001/sec

32768000 msgctl(msg_stat) iterations in 14.774829 seconds = 2217950/sec

32768000 msgctl(ipc_rmid) iterations in 31.134984 seconds = 1052483/sec

             AVERAGE        STD_DEV      MIN     MAX
GET:        665842.852000   946.697555   654049   672208
SEND:       18363.853000   107.514954   18295   19563
IPC_STAT:   14609.669000   43.100272   14529   14881
MSG_STAT:   14774.829000   97.174924   14516   15436
RM:         31134.984000   444.612055   30521   33523


==================================================================

Unfortunately, I haven't kept the exact kernel release numbers, but the 
testing method was:
res_ref = unpatched kernel
res_new = same kernel release with my patches applied.

What I'll try to do is to re-run your tests (pmsg and psem) with this 
method (from what I saw, the patches applied on top of 2.6.23-rc4-mm1), 
but I can't do it before Thursday.

Regards,
Nadia


* Re: Scalability requirements for sysv ipc
  2008-03-25 15:50                   ` Mike Galbraith
@ 2008-03-25 16:13                     ` Peter Zijlstra
  2008-03-25 18:31                       ` Mike Galbraith
  2008-03-26  6:18                       ` Mike Galbraith
  2008-03-30 14:12                     ` Scalability requirements for sysv ipc (+namespaces broken with SEM_UNDO) Manfred Spraul
  1 sibling, 2 replies; 27+ messages in thread
From: Peter Zijlstra @ 2008-03-25 16:13 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Manfred Spraul, Linux Kernel Mailing List, paulmck, Nadia Derbey,
	Andrew Morton, Ingo Molnar

On Tue, 2008-03-25 at 16:50 +0100, Mike Galbraith wrote:
> On Sat, 2008-03-22 at 20:08 +0100, Manfred Spraul wrote:
> 
> > just the normal performance of 2.6.25-rc3 is abysmal, 55 to 60% slower 
> > than 2.6.18.8:
> 
> After manually reverting 3e148c79938aa39035669c1cfa3ff60722134535,
> 2.6.25.git scaled linearly, but as you noted, markedly down from earlier
> kernels with this benchmark.  2.6.24.4 with same revert, but all
> 2.6.25.git ipc changes piled on top still performed close to 2.6.22, so
> I went looking.  Bisection led me to..
> 
> 8f4d37ec073c17e2d4aa8851df5837d798606d6f is first bad commit
> commit 8f4d37ec073c17e2d4aa8851df5837d798606d6f
> Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Date:   Fri Jan 25 21:08:29 2008 +0100
> 
>     sched: high-res preemption tick
> 
>     Use HR-timers (when available) to deliver an accurate preemption tick.
> 
>     The regular scheduler tick that runs at 1/HZ can be too coarse when nice
>     level are used. The fairness system will still keep the cpu utilisation 'fair'
>     by then delaying the task that got an excessive amount of CPU time but try to
>     minimize this by delivering preemption points spot-on.
> 
>     The average frequency of this extra interrupt is sched_latency / nr_latency.
>     Which need not be higher than 1/HZ, its just that the distribution within the
>     sched_latency period is important.
> 
>     Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
>     Signed-off-by: Ingo Molnar <mingo@elte.hu>
> 
> :040000 040000 ab225228500f7a19d5ad20ca12ca3fc8ff5f5ad1 f1742e1d225a72aecea9d6961ed989b5943d31d8 M     arch
> :040000 040000 25d85e4ef7a71b0cc76801a2526ebeb4dce180fe ae61510186b4fad708ef0211ac169decba16d4e5 M     include
> :040000 040000 9247cec7dd506c648ac027c17e5a07145aa41b26 950832cc1dc4d30923f593ecec883a06b45d62e9 M     kernel
> 
> ...and I verified it via :-/ echo 7 > sched_features in latest.  That
> only bought me roughly half though, so there's a part three in there
> somewhere.

Ouch, I guess hrtimers are just way expensive on some hardware... 



* Re: Scalability requirements for sysv ipc
  2008-03-25 16:13                     ` Peter Zijlstra
@ 2008-03-25 18:31                       ` Mike Galbraith
  2008-03-26  6:18                       ` Mike Galbraith
  1 sibling, 0 replies; 27+ messages in thread
From: Mike Galbraith @ 2008-03-25 18:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Manfred Spraul, Linux Kernel Mailing List, paulmck, Nadia Derbey,
	Andrew Morton, Ingo Molnar


On Tue, 2008-03-25 at 17:13 +0100, Peter Zijlstra wrote:
> On Tue, 2008-03-25 at 16:50 +0100, Mike Galbraith wrote:

> > ...and I verified it via :-/ echo 7 > sched_features in latest.  That
> > only bought me roughly half though, so there's a part three in there
> > somewhere.
> 
> Ouch, I guess hrtimers are just way expensive on some hardware... 

That would be about on par with my luck.  I'll try to muster up the
gumption to go looking for part three, though the last time I went
searching, long ago, it proved to be a dead end wrt sysv ipc.

	-Mike



* Re: Scalability requirements for sysv ipc
  2008-03-25 16:13                     ` Peter Zijlstra
  2008-03-25 18:31                       ` Mike Galbraith
@ 2008-03-26  6:18                       ` Mike Galbraith
  1 sibling, 0 replies; 27+ messages in thread
From: Mike Galbraith @ 2008-03-26  6:18 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Manfred Spraul, Linux Kernel Mailing List, paulmck, Nadia Derbey,
	Andrew Morton, Ingo Molnar


On Tue, 2008-03-25 at 17:13 +0100, Peter Zijlstra wrote:

> > ...and I verified it via :-/ echo 7 > sched_features in latest.  That
> > only bought me roughly half though, so there's a part three in there
> > somewhere.
> 
> Ouch, I guess hrtimers are just way expensive on some hardware... 

It takes a large bite out of my P4 as well.



* Re: Scalability requirements for sysv ipc
  2008-03-22  5:43         ` Mike Galbraith
  2008-03-22 10:10           ` Manfred Spraul
@ 2008-03-27 22:29           ` Bill Davidsen
  2008-03-28  9:49             ` Manfred Spraul
  1 sibling, 1 reply; 27+ messages in thread
From: Bill Davidsen @ 2008-03-27 22:29 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Manfred Spraul, paulmck, Nadia Derbey, Linux Kernel Mailing List,
	Andrew Morton

Mike Galbraith wrote:
> On Fri, 2008-03-21 at 17:08 +0100, Manfred Spraul wrote: 
>> Paul E. McKenney wrote:
>>> I could give it a spin -- though I would need to be pointed to the
>>> patch and the test.
>>>
>>>   
>> I'd just compare a recent kernel with something older, pre Fri Oct 19 
>> 11:53:44 2007
>>
>> Then download ctxbench, run one instance on each core, bound with taskset.
>> http://www.tmr.com/%7Epublic/source/
>> (I don't use ctxbench myself; if it doesn't work then I could post my 
>> own app. It would be i386 only with RDTSCs inside)
> 
> (test gizmos are always welcome)
> 
> Results for Q6600 box don't look particularly wonderful.
> 
> taskset -c 3 ./ctx -s 
> 
> 2.6.24.3
> 3766962 itterations in 9.999845 seconds = 376734/sec
> 
> 2.6.22.18-cfs-v24.1
> 4375920 itterations in 10.006199 seconds = 437330/sec
> 
> for i in 0 1 2 3; do taskset -c $i ./ctx -s& done
> 
> 2.6.22.18-cfs-v24.1
> 4355784 itterations in 10.005670 seconds = 435361/sec
> 4396033 itterations in 10.005686 seconds = 439384/sec
> 4390027 itterations in 10.006511 seconds = 438739/sec
> 4383906 itterations in 10.006834 seconds = 438128/sec
> 
> 2.6.24.3
> 1269937 itterations in 9.999757 seconds = 127006/sec
> 1266723 itterations in 9.999663 seconds = 126685/sec
> 1267293 itterations in 9.999348 seconds = 126742/sec
> 1265793 itterations in 9.999766 seconds = 126592/sec
> 
Glad to see that ctxbench is still useful. I think there's a more recent 
version I haven't put up, which uses threads rather than processes, but 
it generated similar values, so I somewhat lost interest. There was a 
"round robin" feature to pass the token through more processes; again, I 
didn't find more use for the data.

I never tried binding the process to a CPU; in general the affinity code 
puts one process per CPU under light load, which limits the context switch 
overhead. It looks as if you are testing only the single-CPU (or core) case.

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Scalability requirements for sysv ipc
  2008-03-27 22:29           ` Bill Davidsen
@ 2008-03-28  9:49             ` Manfred Spraul
  0 siblings, 0 replies; 27+ messages in thread
From: Manfred Spraul @ 2008-03-28  9:49 UTC (permalink / raw)
  To: Bill Davidsen
  Cc: Mike Galbraith, paulmck, Nadia Derbey, Linux Kernel Mailing List,
	Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 1179 bytes --]

Bill Davidsen wrote:
>
> I never tried binding the process to a CPU, in general the affinity 
> code puts one process per CPU under light load, and limits the context 
> switch overhead. It looks as if you are testing only the single CPU 
> (or core) case.
>
Attached is a patch that I wrote that adds cpu binding. Feel free to add 
it to your sources. It's not that useful: recent Linux distros include a 
"taskset" command that can bind a task to a given cpu. I needed it for an 
older distro.
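
With the patch applied, an invocation would look like this (hypothetical 
example; on a newer distro it is equivalent to "taskset -c 2 ./ctx -s"):

	./ctx -s -c2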

With regard to the multi-core case: I've always ignored it; I couldn't 
find a good/realistic test case.
Thundering herds (i.e. one task waking up lots of waiting tasks) are, at 
least for sysv msg and sysv sem, lockless: the woken-up tasks do not take 
any locks; they return immediately to user space.
Additionally, I don't know if the test case is realistic: at least 
postgres uses one semaphore for each process/thread, thus waking up 
multiple tasks never happens.

Another case would be to bind both tasks to different cpus. I'm not sure 
if this happens in real life. Anyone around who knows how other 
databases implement locking? Is sysv sem still used?
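
For completeness, a hypothetical variant of the fork in main() that would 
cover that case (not part of the attached patch; it assumes that cpu 
cpubind+1 exists and is online):

	if ((child = fork()) == 0) {
		/* child on the neighbouring cpu (only when -c was given) */
		do_cpubind(cpubind >= 0 ? cpubind + 1 : -1);
		ChildPID = getpid();
		shmchild(shm, semid);
	} else {
		do_cpubind(cpubind);	/* parent stays on cpu cpubind */
		ChildPID = child;
		shmparent(shm, semid, child);
	}

Every token pass would then also pay for the cacheline transfers between 
the two cpus.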

--
    Manfred


[-- Attachment #2: patch-cpubind --]
[-- Type: text/plain, Size: 4313 bytes --]

diff -ur ctxbench-1.9.orig/ctxbench.c ctxbench-1.9/ctxbench.c
--- ctxbench-1.9.orig/ctxbench.c	2002-12-09 22:41:59.000000000 +0100
+++ ctxbench-1.9/ctxbench.c	2008-03-28 10:30:55.000000000 +0100
@@ -1,19 +1,28 @@
+#include <sched.h>
 #include <time.h>
 #include <errno.h>
 #include <stdio.h>
 #include <signal.h>
 #include <unistd.h>
-#include <sched.h>
 #include <sys/types.h>
 #include <sys/time.h>
 #include <sys/shm.h>
 #include <sys/sem.h>
 #include <sys/msg.h>
 #include <sys/stat.h>
+#include <stdlib.h>
+#include <sys/types.h>
+#include <sys/wait.h>
 
 /* this should be in unistd.h!! */
 /* #include <getopt.h> */
 
+/**************** Prototypes */
+
+void shmchild(int shm, int semid);
+void shmparent(int shm, int semid, pid_t child);
+void do_cpubind(int cpu);
+
 /**************** General internal procs and flags here */
 /*		help/usage */
 static void usage(void);
@@ -25,7 +34,6 @@
 int Niter = 0;
 /*		Use signals rather than semiphores */
 static void sig_NOP();
-static void wait_sig();
 int OkayToRun = 0;
 int ParentPID, ChildPID;
 /*		pipe vectors for -p option */
@@ -79,19 +87,20 @@
 
 int msgqid;
 int do_yield = 0;
-\f
-main(int argc, char *argv[])
+
+int main(int argc, char *argv[])
 {
 	int shm;
 	struct shmid_ds buf;
 	int semid = -1;
-	int child, stat;
+	int cpubind = -1;
+	int child;
 	int RunTime = 10;
 	union semun pvt_semun;
 
 	pvt_semun.val = 0;
 
-	while ((shm = getopt(argc, argv, "sSLYmpn:t:")) != EOF) {
+	while ((shm = getopt(argc, argv, "sSLYmpn:t:c:")) != EOF) {
 		switch (shm) {
 		/* these are IPC types */
 		case 's':	/* use semiphore */
@@ -124,11 +133,14 @@
 		case 't':	/* give time to run */
 			RunTime = atoi(optarg);
 			break;
+		case 'c':	/* bind to a specific cpu */
+			cpubind = atoi(optarg);
+			break;
 		default:	/* typo */
 			usage();
 		}
 	}
-\f
+
 	signal(SIGALRM, timeout);
 	if (RunTime) alarm(RunTime);
 
@@ -164,7 +176,7 @@
 	}
 
 	/* identify version and method */
-	printf("\n\nContext switching benchmark v1.17\n");
+	printf("\n\nContext switching benchmark v1.17-cpubind\n");
 	printf("  Using %s for IPC control\n", IPCname[IPCtype]);
 
 	printf("   Max iterations: %8d (zero = no limit)\n", Iterations);
@@ -174,13 +186,14 @@
 
 	ParentPID = getpid();
 	if ((child = fork()) == 0) {
+		do_cpubind(cpubind);
 		ChildPID = getpid();
 		shmchild(shm, semid);
 	} else {
+		do_cpubind(cpubind);
 		ChildPID = child;
 		shmparent(shm, semid, child);
 	}
-
 	wait(NULL);
 	if (shmctl(shm, IPC_RMID, &buf) != 0) {
 		perror("Error removing shared memory");
@@ -215,14 +228,13 @@
 		break;
 	}
 
-	exit(0);
+	return 0;
 }
-\f
 
 /*******************************/
 /*  child using IPC method */
 
-int shmchild(int shm, int semid)
+void shmchild(int shm, int semid)
 {
 	volatile char *mem;
 	int num = 0;
@@ -313,7 +325,7 @@
 /********************************/
 /*  parent using shared memory  */
 
-int shmparent(int shm, int semid, pid_t child)
+void shmparent(int shm, int semid, pid_t child)
 {
 	volatile char *mem;
 	int num = 0;
@@ -328,7 +340,7 @@
 
 
 	if (!(mem = shmat(shm, 0, 0))) {
-		perror("shmchild: Error attaching shared memory");
+		perror("shmparent: Error attaching shared memory");
 		exit(2);
 	}
 
@@ -439,7 +451,7 @@
 		exit(3);
 	}
 }
-\f
+
 /*****************************************************************
  | usage - give the user a clue
  ****************************************************************/
@@ -458,6 +470,7 @@
 		" -p   use pipes for IPC\n"
 		" -L   spinLock in shared memory\n"
 		" -Y   spinlock with sched_yield (for UP)\n"
+		" -cN  bind to cpu N\n"
 		"\nRun limit options:\n"
 		" -nN  limit loops to N (default via timeout)\n"
 		" -tN  run for N sec, default 10\n\n"
@@ -490,3 +503,22 @@
 	signal(SIGUSR1, sig_NOP);
 	return;
 }
+
+/*****************************************************************
+ | do_cpubind - bind the calling task to a given cpu
+ ****************************************************************/
+
+void do_cpubind(int cpubind)
+{
+	if (cpubind >= 0) {
+		cpu_set_t d;
+		int ret;
+
+		CPU_ZERO(&d);
+		CPU_SET(cpubind, &d);
+		ret = sched_setaffinity(0, sizeof(d), &d);
+		printf("%d: sched_setaffinity %d: %xh\n", getpid(), ret, *((int *)&d));
+		ret = sched_getaffinity(0, sizeof(d), &d);
+		printf("%d: sched_getaffinity %d: %xh\n", getpid(), ret, *((int *)&d));
+	}
+}

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Scalability requirements for sysv ipc (+namespaces broken with SEM_UNDO)
  2008-03-25 15:50                   ` Mike Galbraith
  2008-03-25 16:13                     ` Peter Zijlstra
@ 2008-03-30 14:12                     ` Manfred Spraul
  2008-03-30 15:21                       ` David Newall
  2008-03-30 17:18                       ` Mike Galbraith
  1 sibling, 2 replies; 27+ messages in thread
From: Manfred Spraul @ 2008-03-30 14:12 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Linux Kernel Mailing List, paulmck, Nadia Derbey, Andrew Morton,
	Peter Zijlstra, Pavel Emelianov

Mike Galbraith wrote:
> On Sat, 2008-03-22 at 20:08 +0100, Manfred Spraul wrote:
>
>   
>> just the normal performance of 2.6.25-rc3 is abysmal, 55 to 60% slower 
>> than 2.6.18.8:
>>     
>
> After manually reverting 3e148c79938aa39035669c1cfa3ff60722134535,
> 2.6.25.git scaled linearly
We can't just revert that patch: with IDR, a global lock is mandatory :-(
We must either revert the whole idea of using IDR or live with the 
reduced scalability.

Actually, there are further bugs: the undo structures are not 
namespace-aware, thus the sequence semop with SEM_UNDO, unshare, create a 
new array with the same id but more semaphores, then another semop with 
SEM_UNDO will corrupt kernel memory :-(
I'll try to clean up the bugs first, then I'll look at the scalability 
again.
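
For illustration, a minimal userspace sketch of that sequence (error 
handling omitted; the key 0x1234 is arbitrary, and the new namespace will 
typically hand out the same ipc id again because its id allocator starts 
empty):

	#define _GNU_SOURCE
	#include <sched.h>
	#include <sys/types.h>
	#include <sys/ipc.h>
	#include <sys/sem.h>
	#include <unistd.h>

	int main(void)
	{
		struct sembuf op = { 0, 1, SEM_UNDO };
		int id;

		id = semget(0x1234, 1, IPC_CREAT | 0600);
		semop(id, &op, 1);	/* allocates a one-entry undo structure */

		unshare(CLONE_NEWIPC);	/* new ipc namespace, old undo list survives */

		id = semget(0x1234, 4, IPC_CREAT | 0600);	/* same id, more semaphores */
		op.sem_num = 3;
		semop(id, &op, 1);	/* stale undo structure is too small for sem 3 */
		return 0;
	}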

--
    Manfred

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Scalability requirements for sysv ipc (+namespaces broken with SEM_UNDO)
  2008-03-30 14:12                     ` Scalability requirements for sysv ipc (+namespaces broken with SEM_UNDO) Manfred Spraul
@ 2008-03-30 15:21                       ` David Newall
  2008-03-30 17:18                       ` Mike Galbraith
  1 sibling, 0 replies; 27+ messages in thread
From: David Newall @ 2008-03-30 15:21 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: Mike Galbraith, Linux Kernel Mailing List, paulmck, Nadia Derbey,
	Andrew Morton, Peter Zijlstra, Pavel Emelianov

Manfred Spraul wrote:
> Mike Galbraith wrote:
>> On Sat, 2008-03-22 at 20:08 +0100, Manfred Spraul wrote:
>>> just the normal performance of 2.6.25-rc3 is abysmal, 55 to 60%
>>> slower than 2.6.18.8:
>>>     
>>
>> After manually reverting 3e148c79938aa39035669c1cfa3ff60722134535,
>> 2.6.25.git scaled linearly
> We can't just revert that patch: with IDR, a global lock is mandatory :-(
> We must either revert the whole idea of using IDR or live with the
> reduced scalability.
>
> Actually, there are further bugs: the undo structures are not
> namespace-aware, thus the sequence semop with SEM_UNDO, unshare, create a
> new array with the same id but more semaphores, then another semop with
> SEM_UNDO will corrupt kernel memory :-(

You should revert it all.  The scalability problem isn't good, but from
what you're saying, the idea isn't ready yet.  Revert it all, fix the
problems at your leisure, and submit new patches then.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Scalability requirements for sysv ipc (+namespaces broken with SEM_UNDO)
  2008-03-30 14:12                     ` Scalability requirements for sysv ipc (+namespaces broken with SEM_UNDO) Manfred Spraul
  2008-03-30 15:21                       ` David Newall
@ 2008-03-30 17:18                       ` Mike Galbraith
  2008-04-04 14:59                         ` Nadia Derbey
  1 sibling, 1 reply; 27+ messages in thread
From: Mike Galbraith @ 2008-03-30 17:18 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: Linux Kernel Mailing List, paulmck, Nadia Derbey, Andrew Morton,
	Peter Zijlstra, Pavel Emelianov


On Sun, 2008-03-30 at 16:12 +0200, Manfred Spraul wrote:
> Mike Galbraith wrote:
> > On Sat, 2008-03-22 at 20:08 +0100, Manfred Spraul wrote:
> >
> >   
> >> just the normal performance of 2.6.25-rc3 is abysmal, 55 to 60% slower 
> >> than 2.6.18.8:
> >>     
> >
> > After manually reverting 3e148c79938aa39035669c1cfa3ff60722134535,
> > 2.6.25.git scaled linearly
> We can't just revert that patch: with IDR, a global lock is mandatory :-(
> We must either revert the whole idea of using IDR or live with the 
> reduced scalability.

Yeah, I looked at the problem, but didn't know what the heck to do about
it, so just grabbed my axe to verify/quantify.

> Actually, there are further bugs: the undo structures are not 
> namespace-aware, thus the sequence semop with SEM_UNDO, unshare, create a 
> new array with the same id but more semaphores, then another semop with 
> SEM_UNDO will corrupt kernel memory :-(
> I'll try to clean up the bugs first, then I'll look at the scalability 
> again.

Great!

	-Mike



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Scalability requirements for sysv ipc (+namespaces broken with SEM_UNDO)
  2008-03-30 17:18                       ` Mike Galbraith
@ 2008-04-04 14:59                         ` Nadia Derbey
  2008-04-04 15:03                           ` Nadia Derbey
  0 siblings, 1 reply; 27+ messages in thread
From: Nadia Derbey @ 2008-04-04 14:59 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Manfred Spraul, Linux Kernel Mailing List, paulmck,
	Andrew Morton, Peter Zijlstra, Pavel Emelianov

[-- Attachment #1: Type: text/plain, Size: 1940 bytes --]

Mike Galbraith wrote:
> On Sun, 2008-03-30 at 16:12 +0200, Manfred Spraul wrote:
> 
>>Mike Galbraith wrote:
>>
>>>On Sat, 2008-03-22 at 20:08 +0100, Manfred Spraul wrote:
>>>
>>>  
>>>
>>>>just the normal performance of 2.6.25-rc3 is abysmal, 55 to 60% slower 
>>>>than 2.6.18.8:
>>>>    
>>>
>>>After manually reverting 3e148c79938aa39035669c1cfa3ff60722134535,
>>>2.6.25.git scaled linearly
>>
>>We can't just revert that patch: with IDR, a global lock is mandatory :-(
>>We must either revert the whole idea of using IDR or live with the 
>>reduced scalability.
> 
> 
> Yeah, I looked at the problem, but didn't know what the heck to do about
> it, so just grabbed my axe to verify/quantify.
> 
> 
>>Actually, there are further bugs: the undo structures are not 
>>namespace-aware, thus the sequence semop with SEM_UNDO, unshare, create a 
>>new array with the same id but more semaphores, then another semop with 
>>SEM_UNDO will corrupt kernel memory :-(
>>I'll try to clean up the bugs first, then I'll look at the scalability 
>>again.
> 
> 
> Great!
> 
> 	-Mike
> 
> 
> 
> 

I could get better results with the following solution: I wrote an 
RCU-based idr API (layer allocation is managed similarly to the 
radix-tree one).

Using it in the ipc code lets me get rid of the read lock taken in 
ipc_lock() (the one introduced in 3e148c79938aa39035669c1cfa3ff60722134535).
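
For illustration, the lookup side would presumably follow the usual RCU 
pattern, roughly like this (a sketch only, not the actual patch; 
ipc_lock_rcu, ridr_find and ipcs_ridr are hypothetical names):

	struct kern_ipc_perm *ipc_lock_rcu(struct ipc_ids *ids, int lid)
	{
		struct kern_ipc_perm *out;

		rcu_read_lock();
		out = ridr_find(&ids->ipcs_ridr, lid);	/* RCU-safe, no rw_mutex */
		if (out == NULL) {
			rcu_read_unlock();
			return ERR_PTR(-EINVAL);
		}
		spin_lock(&out->lock);		/* per-object lock only */
		if (out->deleted) {		/* recheck after locking */
			spin_unlock(&out->lock);
			rcu_read_unlock();
			return ERR_PTR(-EINVAL);
		}
		return out;
	}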

You'll find the results in attachment (kernel is 2.6.25-rc3-mm1).
output.25_rc3_mm1.ref.8  --> pmsg output for the 2.6.25-rc3-mm1
plot.25_rc3_mm1.ref.8    --> previous file results for use by gnuplot
output.25_rc3_mm1.ridr.8 --> pmsg output for the 2.6.25-rc3-mm1
                              + rcu-based idrs
plot.25_rc3_mm1.ridr.8   --> previous file results for use by gnuplot


I think I should be able to send a patch next week. The code is 
presently ugly: I copied idr.c and idr.h into ridr.c and ridr.h to go 
fast, so I didn't do any code factorization.

Regards
Nadia




[-- Attachment #2: output.25_rc3_mm1.ref.8 --]
[-- Type: text/x-troff-man, Size: 5846 bytes --]

pmsg [nr queues] [timeout]
Using 1 queues/cpus (2 threads) for 5 seconds.
thread 0: sysvmsg        0 type 1 bound to 0001h
thread 1: sysvmsg        0 type 0 bound to 0001h
Result matrix:
  Thread   0:   488650       1:   488650
Total: 977300
pmsg [nr queues] [timeout]
Using 2 queues/cpus (4 threads) for 5 seconds.
thread 0: sysvmsg    32768 type 1 bound to 0001h
thread 1: sysvmsg    65537 type 1 bound to 0002h
thread 3: sysvmsg    65537 type 0 bound to 0002h
thread 2: sysvmsg    32768 type 0 bound to 0001h
Result matrix:
  Thread   0:   223991       2:   223991
  Thread   1:   225588       3:   225588
Total: 899158
pmsg [nr queues] [timeout]
Using 3 queues/cpus (6 threads) for 5 seconds.
thread 0: sysvmsg    98304 type 1 bound to 0001h
thread 1: sysvmsg   131073 type 1 bound to 0002h
thread 2: sysvmsg   163842 type 1 bound to 0004h
thread 5: sysvmsg   163842 type 0 bound to 0004h
thread 4: sysvmsg   131073 type 0 bound to 0002h
thread 3: sysvmsg    98304 type 0 bound to 0001h
Result matrix:
  Thread   0:   183407       3:   183407
  Thread   1:   184030       4:   184030
  Thread   2:   357875       5:   357876
Total: 1450625
pmsg [nr queues] [timeout]
Using 4 queues/cpus (8 threads) for 5 seconds.
thread 0: sysvmsg   196608 type 1 bound to 0001h
thread 1: sysvmsg   229377 type 1 bound to 0002h
thread 2: sysvmsg   262146 type 1 bound to 0004h
thread 3: sysvmsg   294915 type 1 bound to 0008h
thread 5: sysvmsg   229377 type 0 bound to 0002h
thread 6: sysvmsg   262146 type 0 bound to 0004h
thread 7: sysvmsg   294915 type 0 bound to 0008h
thread 4: sysvmsg   196608 type 0 bound to 0001h
Result matrix:
  Thread   0:   166911       4:   166912
  Thread   1:   159281       5:   159281
  Thread   2:   166024       6:   166024
  Thread   3:   167440       7:   167440
Total: 1319313
pmsg [nr queues] [timeout]
Using 5 queues/cpus (10 threads) for 5 seconds.
thread 0: sysvmsg   327680 type 1 bound to 0001h
thread 2: sysvmsg   393218 type 1 bound to 0004h
thread 3: sysvmsg   425987 type 1 bound to 0008h
thread 4: sysvmsg   458756 type 1 bound to 0010h
thread 9: sysvmsg   458756 type 0 bound to 0010h
thread 6: sysvmsg   360449 type 0 bound to 0002h
thread 8: sysvmsg   425987 type 0 bound to 0008h
thread 7: sysvmsg   393218 type 0 bound to 0004h
thread 1: sysvmsg   360449 type 1 bound to 0002h
thread 5: sysvmsg   327680 type 0 bound to 0001h
Result matrix:
  Thread   0:    39740       5:    39740
  Thread   1:    40399       6:    40399
  Thread   2:    40326       7:    40327
  Thread   3:    39290       8:    39290
  Thread   4:    68684       9:    68685
Total: 456880
pmsg [nr queues] [timeout]
Using 6 queues/cpus (12 threads) for 5 seconds.
thread 0: sysvmsg   491520 type 1 bound to 0001h
thread 1: sysvmsg   524289 type 1 bound to 0002h
thread 2: sysvmsg   557058 type 1 bound to 0004h
thread 3: sysvmsg   589827 type 1 bound to 0008h
thread 4: sysvmsg   622596 type 1 bound to 0010h
thread 5: sysvmsg   655365 type 1 bound to 0020h
thread 6: sysvmsg   491520 type 0 bound to 0001h
thread 11: sysvmsg   655365 type 0 bound to 0020h
thread 10: sysvmsg   622596 type 0 bound to 0010h
thread 8: sysvmsg   557058 type 0 bound to 0004h
thread 9: sysvmsg   589827 type 0 bound to 0008h
thread 7: sysvmsg   524289 type 0 bound to 0002h
Result matrix:
  Thread   0:    27901       6:    27901
  Thread   1:    28554       7:    28555
  Thread   2:    28471       8:    28472
  Thread   3:    28015       9:    28016
  Thread   4:    28213      10:    28213
  Thread   5:    28396      11:    28396
Total: 339103
pmsg [nr queues] [timeout]
Using 7 queues/cpus (14 threads) for 5 seconds.
thread 0: sysvmsg   688128 type 1 bound to 0001h
thread 1: sysvmsg   720897 type 1 bound to 0002h
thread 2: sysvmsg   753666 type 1 bound to 0004h
thread 3: sysvmsg   786435 type 1 bound to 0008h
thread 4: sysvmsg   819204 type 1 bound to 0010h
thread 5: sysvmsg   851973 type 1 bound to 0020h
thread 6: sysvmsg   884742 type 1 bound to 0040h
thread 13: sysvmsg   884742 type 0 bound to 0040h
thread 7: sysvmsg   688128 type 0 bound to 0001h
thread 11: sysvmsg   819204 type 0 bound to 0010h
thread 12: sysvmsg   851973 type 0 bound to 0020h
thread 8: sysvmsg   720897 type 0 bound to 0002h
thread 10: sysvmsg   786435 type 0 bound to 0008h
thread 9: sysvmsg   753666 type 0 bound to 0004h
Result matrix:
  Thread   0:    12201       7:    12201
  Thread   1:    12451       8:    12452
  Thread   2:    12345       9:    12345
  Thread   3:    12277      10:    12278
  Thread   4:    12259      11:    12259
  Thread   5:    12364      12:    12365
  Thread   6:    24666      13:    24666
Total: 197129
pmsg [nr queues] [timeout]
Using 8 queues/cpus (16 threads) for 5 seconds.
thread 0: sysvmsg   917504 type 1 bound to 0001h
thread 1: sysvmsg   950273 type 1 bound to 0002h
thread 2: sysvmsg   983042 type 1 bound to 0004h
thread 3: sysvmsg  1015811 type 1 bound to 0008h
thread 4: sysvmsg  1048580 type 1 bound to 0010h
thread 5: sysvmsg  1081349 type 1 bound to 0020h
thread 6: sysvmsg  1114118 type 1 bound to 0040h
thread 7: sysvmsg  1146887 type 1 bound to 0080h
thread 15: sysvmsg  1146887 type 0 bound to 0080h
thread 8: sysvmsg   917504 type 0 bound to 0001h
thread 14: sysvmsg  1114118 type 0 bound to 0040h
thread 13: sysvmsg  1081349 type 0 bound to 0020h
thread 12: sysvmsg  1048580 type 0 bound to 0010h
thread 11: sysvmsg  1015811 type 0 bound to 0008h
thread 10: sysvmsg   983042 type 0 bound to 0004h
thread 9: sysvmsg   950273 type 0 bound to 0002h
Result matrix:
  Thread   0:    11082       8:    11083
  Thread   1:    11461       9:    11461
  Thread   2:    11430      10:    11431
  Thread   3:    11184      11:    11185
  Thread   4:    11373      12:    11374
  Thread   5:    11290      13:    11291
  Thread   6:    11265      14:    11266
  Thread   7:    11324      15:    11325
Total: 180825

[-- Attachment #3: plot.25_rc3_mm1.ref.8 --]
[-- Type: text/x-troff-man, Size: 74 bytes --]

1 977300
2 899158
3 1450625
4 1319313
5 456880
6 339103
7 197129
8 180825

[-- Attachment #4: output.25_rc3_mm1.ridr.8 --]
[-- Type: text/x-troff-man, Size: 5851 bytes --]

pmsg [nr queues] [timeout]
Using 1 queues/cpus (2 threads) for 5 seconds.
thread 0: sysvmsg        0 type 1 bound to 0001h
thread 1: sysvmsg        0 type 0 bound to 0001h
Result matrix:
  Thread   0:   549365       1:   549365
Total: 1098730
pmsg [nr queues] [timeout]
Using 2 queues/cpus (4 threads) for 5 seconds.
thread 0: sysvmsg    32768 type 1 bound to 0001h
thread 1: sysvmsg    65537 type 1 bound to 0002h
thread 3: sysvmsg    65537 type 0 bound to 0002h
thread 2: sysvmsg    32768 type 0 bound to 0001h
Result matrix:
  Thread   0:   245002       2:   245003
  Thread   1:   246618       3:   246619
Total: 983242
pmsg [nr queues] [timeout]
Using 3 queues/cpus (6 threads) for 5 seconds.
thread 0: sysvmsg    98304 type 1 bound to 0001h
thread 1: sysvmsg   131073 type 1 bound to 0002h
thread 2: sysvmsg   163842 type 1 bound to 0004h
thread 5: sysvmsg   163842 type 0 bound to 0004h
thread 4: sysvmsg   131073 type 0 bound to 0002h
thread 3: sysvmsg    98304 type 0 bound to 0001h
Result matrix:
  Thread   0:   231585       3:   231586
  Thread   1:   233256       4:   233256
  Thread   2:   509630       5:   509631
Total: 1948944
pmsg [nr queues] [timeout]
Using 4 queues/cpus (8 threads) for 5 seconds.
thread 0: sysvmsg   196608 type 1 bound to 0001h
thread 1: sysvmsg   229377 type 1 bound to 0002h
thread 2: sysvmsg   262146 type 1 bound to 0004h
thread 3: sysvmsg   294915 type 1 bound to 0008h
thread 5: sysvmsg   229377 type 0 bound to 0002h
thread 6: sysvmsg   262146 type 0 bound to 0004h
thread 7: sysvmsg   294915 type 0 bound to 0008h
thread 4: sysvmsg   196608 type 0 bound to 0001h
Result matrix:
  Thread   0:   233392       4:   233392
  Thread   1:   234485       5:   234486
  Thread   2:   235604       6:   235604
  Thread   3:   235683       7:   235683
Total: 1878329
pmsg [nr queues] [timeout]
Using 5 queues/cpus (10 threads) for 5 seconds.
thread 0: sysvmsg   327680 type 1 bound to 0001h
thread 2: sysvmsg   393218 type 1 bound to 0004h
thread 3: sysvmsg   425987 type 1 bound to 0008h
thread 4: sysvmsg   458756 type 1 bound to 0010h
thread 1: sysvmsg   360449 type 1 bound to 0002h
thread 9: sysvmsg   458756 type 0 bound to 0010h
thread 6: sysvmsg   360449 type 0 bound to 0002h
thread 7: sysvmsg   393218 type 0 bound to 0004h
thread 8: sysvmsg   425987 type 0 bound to 0008h
thread 5: sysvmsg   327680 type 0 bound to 0001h
Result matrix:
  Thread   0:   216094       5:   216095
  Thread   1:   227109       6:   227110
  Thread   2:   222042       7:   222042
  Thread   3:   222708       8:   222708
  Thread   4:   467186       9:   467187
Total: 2710281
pmsg [nr queues] [timeout]
Using 6 queues/cpus (12 threads) for 5 seconds.
thread 0: sysvmsg   491520 type 1 bound to 0001h
thread 1: sysvmsg   524289 type 1 bound to 0002h
thread 2: sysvmsg   557058 type 1 bound to 0004h
thread 3: sysvmsg   589827 type 1 bound to 0008h
thread 4: sysvmsg   622596 type 1 bound to 0010h
thread 5: sysvmsg   655365 type 1 bound to 0020h
thread 6: sysvmsg   491520 type 0 bound to 0001h
thread 11: sysvmsg   655365 type 0 bound to 0020h
thread 8: sysvmsg   557058 type 0 bound to 0004h
thread 10: sysvmsg   622596 type 0 bound to 0010h
thread 9: sysvmsg   589827 type 0 bound to 0008h
thread 7: sysvmsg   524289 type 0 bound to 0002h
Result matrix:
  Thread   0:   224027       6:   224028
  Thread   1:   225394       7:   225394
  Thread   2:   223545       8:   223545
  Thread   3:   223599       9:   223599
  Thread   4:   224632      10:   224633
  Thread   5:   224511      11:   224512
Total: 2691419
pmsg [nr queues] [timeout]
Using 7 queues/cpus (14 threads) for 5 seconds.
thread 0: sysvmsg   688128 type 1 bound to 0001h
thread 1: sysvmsg   720897 type 1 bound to 0002h
thread 2: sysvmsg   753666 type 1 bound to 0004h
thread 3: sysvmsg   786435 type 1 bound to 0008h
thread 4: sysvmsg   819204 type 1 bound to 0010h
thread 5: sysvmsg   851973 type 1 bound to 0020h
thread 6: sysvmsg   884742 type 1 bound to 0040h
thread 13: sysvmsg   884742 type 0 bound to 0040h
thread 8: sysvmsg   720897 type 0 bound to 0002h
thread 9: sysvmsg   753666 type 0 bound to 0004h
thread 10: sysvmsg   786435 type 0 bound to 0008h
thread 11: sysvmsg   819204 type 0 bound to 0010h
thread 7: sysvmsg   688128 type 0 bound to 0001h
thread 12: sysvmsg   851973 type 0 bound to 0020h
Result matrix:
  Thread   0:   188264       7:   188264
  Thread   1:   190677       8:   190677
  Thread   2:   188850       9:   188851
  Thread   3:   188925      10:   188926
  Thread   4:   190333      11:   190334
  Thread   5:   189235      12:   189235
  Thread   6:   386862      13:   386863
Total: 3046296
pmsg [nr queues] [timeout]
Using 8 queues/cpus (16 threads) for 5 seconds.
thread 0: sysvmsg   917504 type 1 bound to 0001h
thread 1: sysvmsg   950273 type 1 bound to 0002h
thread 2: sysvmsg   983042 type 1 bound to 0004h
thread 3: sysvmsg  1015811 type 1 bound to 0008h
thread 4: sysvmsg  1048580 type 1 bound to 0010h
thread 5: sysvmsg  1081349 type 1 bound to 0020h
thread 6: sysvmsg  1114118 type 1 bound to 0040h
thread 7: sysvmsg  1146887 type 1 bound to 0080h
thread 8: sysvmsg   917504 type 0 bound to 0001h
thread 10: sysvmsg   983042 type 0 bound to 0004h
thread 11: sysvmsg  1015811 type 0 bound to 0008h
thread 12: sysvmsg  1048580 type 0 bound to 0010h
thread 13: sysvmsg  1081349 type 0 bound to 0020h
thread 9: sysvmsg   950273 type 0 bound to 0002h
thread 15: sysvmsg  1146887 type 0 bound to 0080h
thread 14: sysvmsg  1114118 type 0 bound to 0040h
Result matrix:
  Thread   0:   187613       8:   187614
  Thread   1:   190488       9:   190489
  Thread   2:   190112      10:   190113
  Thread   3:   190374      11:   190375
  Thread   4:   190658      12:   190658
  Thread   5:   190508      13:   190508
  Thread   6:   189222      14:   189223
  Thread   7:   190272      15:   190272
Total: 3038499

[-- Attachment #5: plot.25_rc3_mm1.ridr.8 --]
[-- Type: text/x-troff-man, Size: 79 bytes --]

1 1098730
2 983242
3 1948944
4 1878329
5 2710281
6 2691419
7 3046296
8 3038499
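
For reference, a hypothetical gnuplot command to compare the two runs 
(column 1 is the number of queues/cpus, column 2 the total throughput):

	plot "plot.25_rc3_mm1.ref.8"  using 1:2 with linespoints title "2.6.25-rc3-mm1", \
	     "plot.25_rc3_mm1.ridr.8" using 1:2 with linespoints title "+ rcu-based idr"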

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Scalability requirements for sysv ipc (+namespaces broken with SEM_UNDO)
  2008-04-04 14:59                         ` Nadia Derbey
@ 2008-04-04 15:03                           ` Nadia Derbey
  0 siblings, 0 replies; 27+ messages in thread
From: Nadia Derbey @ 2008-04-04 15:03 UTC (permalink / raw)
  To: Nadia Derbey
  Cc: Mike Galbraith, Manfred Spraul, Linux Kernel Mailing List,
	paulmck, Andrew Morton, Peter Zijlstra, Pavel Emelianov,
	NADIA DERBEY

Nadia Derbey wrote:
> Mike Galbraith wrote:
> 
>> On Sun, 2008-03-30 at 16:12 +0200, Manfred Spraul wrote:
>>
>>> Mike Galbraith wrote:
>>>
>>>> On Sat, 2008-03-22 at 20:08 +0100, Manfred Spraul wrote:
>>>>
>>>>  
>>>>
>>>>> just the normal performance of 2.6.25-rc3 is abysmal, 55 to 60% 
>>>>> slower than 2.6.18.8:
>>>>>    
>>>>
>>>>
>>>> After manually reverting 3e148c79938aa39035669c1cfa3ff60722134535,
>>>> 2.6.25.git scaled linearly
>>>
>>>
>>> We can't just revert that patch: with IDR, a global lock is mandatory 
>>> :-(
>>> We must either revert the whole idea of using IDR or live with the 
>>> reduced scalability.
>>
>>
>>
>> Yeah, I looked at the problem, but didn't know what the heck to do about
>> it, so just grabbed my axe to verify/quantify.
>>
>>
>>> Actually, there are further bugs: the undo structures are not 
>>> namespace-aware, thus the sequence semop with SEM_UNDO, unshare, create 
>>> a new array with the same id but more semaphores, then another semop 
>>> with SEM_UNDO will corrupt kernel memory :-(
>>> I'll try to clean up the bugs first, then I'll look at the 
>>> scalability again.
>>
>>
>>
>> Great!
>>
>>     -Mike
>>
>>
>>
>>
> 
> I could get better results with the following solution: I wrote an 
> RCU-based idr API (layer allocation is managed similarly to the 
> radix-tree one).
> 
> Using it in the ipc code lets me get rid of the read lock taken in 
> ipc_lock() (the one introduced in 
> 3e148c79938aa39035669c1cfa3ff60722134535).
> 
> You'll find the results in attachment (kernel is 2.6.25-rc3-mm1).
> output.25_rc3_mm1.ref.8  --> pmsg output for the 2.6.25-rc3-mm1
> plot.25_rc3_mm1.ref.8    --> previous file results for use by gnuplot
> output.25_rc3_mm1.ridr.8 --> pmsg output for the 2.6.25-rc3-mm1
>                              + rcu-based idrs
> plot.25_rc3_mm1.ridr.8   --> previous file results for use by gnuplot
> 
> 
> I think I should be able to send a patch next week. The code is 
> presently ugly: I copied idr.c and idr.h into ridr.c and ridr.h to go 
> fast, so I didn't do any code factorization.
> 
> Regards
> Nadia
> 
> 
> 

Sorry forgot the command:

for i in 1 2 3 4 5 6 7 8;do ./pmsg $i 5;done > output.25_rc3_mm1.ref.8
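
and presumably the matching run on the rcu-based idr kernel:

for i in 1 2 3 4 5 6 7 8;do ./pmsg $i 5;done > output.25_rc3_mm1.ridr.8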


Regards,
Nadia


^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2008-04-04 15:04 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-03-21  9:41 Scalability requirements for sysv ipc (was: ipc: store ipcs into IDRs) Manfred Spraul
2008-03-21 12:45 ` Nadia Derbey
2008-03-21 13:33   ` Scalability requirements for sysv ipc Manfred Spraul
2008-03-21 14:13     ` Paul E. McKenney
2008-03-21 16:08       ` Manfred Spraul
2008-03-22  5:43         ` Mike Galbraith
2008-03-22 10:10           ` Manfred Spraul
2008-03-22 11:53             ` Mike Galbraith
2008-03-22 14:22               ` Manfred Spraul
2008-03-22 19:08                 ` Manfred Spraul
2008-03-25 15:50                   ` Mike Galbraith
2008-03-25 16:13                     ` Peter Zijlstra
2008-03-25 18:31                       ` Mike Galbraith
2008-03-26  6:18                       ` Mike Galbraith
2008-03-30 14:12                     ` Scalability requirements for sysv ipc (+namespaces broken with SEM_UNDO) Manfred Spraul
2008-03-30 15:21                       ` David Newall
2008-03-30 17:18                       ` Mike Galbraith
2008-04-04 14:59                         ` Nadia Derbey
2008-04-04 15:03                           ` Nadia Derbey
2008-03-22 19:35                 ` Scalability requirements for sysv ipc Mike Galbraith
2008-03-23  6:38                   ` Manfred Spraul
2008-03-23  7:15                     ` Mike Galbraith
2008-03-23  7:08                   ` Mike Galbraith
2008-03-23  7:20                     ` Mike Galbraith
2008-03-27 22:29           ` Bill Davidsen
2008-03-28  9:49             ` Manfred Spraul
2008-03-25 16:00     ` Nadia Derbey
